Automatic Management of Unix/Linux systems: Log Management

Introduction to the AIX Error Log and Its Automatic Monitoring Mechanism

Most Unix/Linux systems use syslog as the system logging mechanism. AIX supports syslog as well, but the AIX operating system and its main applications record their events in the AIX error log; only a small number of applications use syslog. The functions and configuration of syslog on AIX are very similar to those on Linux, so they are not repeated here.

The AIX error log mechanism is part of the AIX base operating system (BOS). By default, it can be used without any configuration.

AIX Error Log Mechanism Components

The AIX error log mechanism consists of the following parts:

  1. Device file /dev/error: receives the log entries generated by the kernel and applications.
  2. Daemon /usr/lib/errdemon: started automatically during system initialization. It monitors the entries that the kernel and applications write to /dev/error and appends them to the log file.
  3. Log file /var/adm/ras/errlog: the default log file. The log file location can be changed with /usr/lib/errdemon -i.
  4. Auxiliary programs: in addition to the device file, the daemon, and the log file, the AIX error log provides a range of auxiliary programs to configure, operate on, analyze, and report on the error log. The following sections describe them in detail.

AIX Error Log Configuration

The AIX error log works without any configuration, and the defaults suit most scenarios, but AIX still provides a configuration interface. It lets you change the buffer size of the device file /dev/error, the location of the log file, the size limit of the log file, and the handling of duplicate entries. All of these settings are made with the command /usr/lib/errdemon.

  1. Modify the buffer size of the error log device

    Entries written to the error log device /dev/error pass through an in-memory buffer. By default, the buffer size is 8 KB. It can be changed with /usr/lib/errdemon -B. If the new buffer size is larger than the current one, the change takes effect immediately. If it is smaller, the change takes effect after errdemon restarts.

  2. Configure the log file path

    By default, the AIX error log stores its entries in the file /var/adm/ras/errlog. The path of the error log file can be changed with /usr/lib/errdemon -i. The new path takes effect immediately.

  3. Configure the log file size limit

    The size limit of the AIX error log file is configurable with /usr/lib/errdemon -s. If the new size is larger than the current one, the change takes effect immediately. If the new size is smaller than the current one, the existing log file is first backed up, and errdemon then creates a new log file with the new size limit.

  4. Configure duplicate entry handling

    If the operating system or an application reports the same information or error repeatedly, duplicate entries appear in the error log. The AIX error log handles them as follows: entries with identical content that arrive within a certain time interval are marked as duplicates, and once the number of duplicates exceeds a preset threshold, further entries are no longer recorded as duplicates. Duplicate handling is switched on or off with /usr/lib/errdemon -d, the time interval is set with /usr/lib/errdemon -t, and the maximum number of duplicate entries is set with /usr/lib/errdemon -m.

AIX Error Log Usage

Once the AIX error log mechanism is running, the operating system and applications use it to record events and errors. This section describes the commands used day to day with the AIX error log, which mainly cover generating error log reports and deleting error log entries.

Generate an AIX error log report

The AIX command errpt generates an error log report. errpt provides many options to select the source data range and the report format. For example, -d restricts the report to specific error classes, -s and -e select the entries within a time range, -l selects the entries with specific sequence numbers, and -a prints the detailed information of each entry. For the full usage, see the errpt man page.

The following example shows errpt output; the fields of an AIX error log entry are described after it.


Listing 1. errpt command output example

  # errpt -a -l 2
  ---------------------------------------------------------
  LABEL:          REBOOT_ID
  IDENTIFIER:     2BFA76F6

  Date/Time:       Mon Mar  2 21:38:21 2009
  Sequence Number: 2
  Machine Id:      00C0DD724C00
  Node Id:         p6ml4n05
  Class:           S
  Type:            TEMP
  WPAR:            Global
  Resource Name:   SYSPROC

  Description
  SYSTEM SHUTDOWN BY USER

  Probable Causes
  SYSTEM SHUTDOWN

  Detail Data
  USER ID
             0
  0=SOFT IPL 1=HALT 2=TIME REBOOT
             0
  TIME TO REBOOT (FOR TIMED REBOOT ONLY)
             0
  #

The main fields are described as follows:

Identifier: numeric identifier of the event

Date/Time: date and time when the event occurred

Sequence Number: sequence number of the event

Machine Id: identifier of the processor of the node where the event occurred

Node Id: name of the node where the event occurred

Class: the class of the event. The AIX error log currently supports the following classes:

H: hardware

S: software

O: informational entries

U: the event class cannot be determined

Type: the severity of the event. The AIX error log currently supports the following severity levels:

PEND: the device or component is about to become unavailable

PERF: the performance of the device or component has dropped below the acceptable threshold

PERM: the error cannot be recovered from. PERM is the most severe type; PERM entries often indicate that a hardware or software component has failed and cannot be repaired.

TEMP: the error was recovered from after several failed attempts. TEMP is also used for informational entries.

UNKN: the severity of the event cannot be determined

INFO: information, not an error

Resource Name: name of the component that generated the entry

Description: brief description of the event

Probable Causes: possible causes of the event
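The header fields above can also be pulled out of a detailed report by a script. The following is a minimal sketch, not an AIX-supplied tool: errpt_field and sample_entry are hypothetical helper names, and the sample input is simply the header of Listing 1, so the block runs on any system with a POSIX shell and awk.

```shell
#!/bin/sh
# Print the value of one "Name: value" header field from errpt -a output
# read on stdin. $1 = field name, matched case-insensitively.
errpt_field() {
    awk -v name="$1" '
        BEGIN { key = tolower(name) ":" }
        tolower(substr($0, 1, length(key))) == key {
            val = substr($0, length(key) + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", val)   # trim surrounding blanks
            print val
        }'
}

# Header lines copied from Listing 1, used here as sample input.
sample_entry() {
    cat <<'EOF'
LABEL:          REBOOT_ID
IDENTIFIER:     2BFA76F6
Date/Time:       Mon Mar  2 21:38:21 2009
Sequence Number: 2
Machine Id:      00C0DD724C00
Node Id:         p6ml4n05
Class:           S
Type:            TEMP
WPAR:            Global
Resource Name:   SYSPROC
EOF
}

sample_entry | errpt_field "Class"
sample_entry | errpt_field "Type"
```

On an AIX host the same function could be fed real data, for example with errpt -a -l 2 | errpt_field Class.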

Delete AIX Error Log Entries

The errclear command deletes AIX error log entries. errclear also provides options that limit the deletion. For example, -d deletes only the entries of specific classes, and -l deletes only the entries with specific sequence numbers. errclear is typically run periodically from a cron entry to keep the error log file cleaned up.
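As a rough illustration of the cron usage, the sketch below only prints candidate crontab lines; it does not call errclear itself, so it can be tried on any system. errclear_cron_line is a hypothetical helper, the /usr/bin/errclear path and the trailing "days to keep" argument are stated here as assumptions about a typical AIX installation, and the hours and retention periods are example values only.

```shell
#!/bin/sh
# Build a crontab line that runs errclear once a day.
# $1 = hour (0-23), $2 = comma-separated class list (H, S, O, U),
# $3 = age in days; entries older than this are deleted.
errclear_cron_line() {
    printf '0 %s * * * /usr/bin/errclear -d %s %s\n' "$1" "$2" "$3"
}

# Example: keep software and informational entries for 30 days,
# hardware entries for 90 days.
errclear_cron_line 11 S,O 30
errclear_cron_line 12 H 90
```

The printed lines can be reviewed and then added to root's crontab with crontab -e.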

Manually generate an AIX Error Log Entry

The command errlogger can be used to manually generate an AIX error log entry. Manually generated AIX error log entries can be used to test the AIX error log function or to test the automatic monitoring function to be discussed below.

Automatic AIX Error Log Monitoring

The AIX operating system provides an error notification mechanism for the error log through the ODM errnotify class. Error notification is implemented by adding an instance of the errnotify ODM class: when the AIX error log mechanism records a matching entry, the command predefined in that instance is called to notify the system administrator or to perform a repair action.

This error notification mechanism meets the requirements of automated monitoring. To monitor the AIX error log, add an errnotify instance; when an error it describes occurs, the corresponding command is called to notify the system administrator. Note that only the root user can add errnotify class instances.

The following steps show how to add an errnotify instance and use the errnotify mechanism to monitor the AIX error log automatically:

  1. Generate a stanza file for the ODM errnotify class instance

    A stanza file for an ODM errnotify instance specifies several descriptors. The main ones are:

    en_name: name of the errnotify class instance, which must be globally unique.

    en_persistenceflg: whether the instance is deleted automatically when the system restarts. 0 means the instance is deleted at restart, and 1 means it is kept.

    en_type: restricts monitoring to error log entries of specific severities, such as INFO and PEND.

    en_class: restricts monitoring to error log entries of specific classes, such as H and S.

    en_method: the action taken when a matching error log entry is recorded. The action can be any script, program, or operating system command. errnotify passes the following values describing the error log entry to the method:

    $1 sequence number of the error log entry

    $2 error ID of the error log entry

    $3 class of the error log entry

    $4 type (severity) of the error log entry

    $5 alert flags of the error log entry

    $6 resource name of the component that generated the entry

    $7 resource type of the component that generated the entry

    $8 resource class of the component that generated the entry

    $9 error label of the error log entry

    For more information about the fields of errnotify instances, see the AIX documentation General Programming Concepts: Writing and Debugging Programs.
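When the notification logic outgrows a one-line command, en_method can point at a script that receives $1 through $9 as ordinary positional arguments. The sketch below shows one way such a script might format them; format_errnotify_args is a hypothetical name, and the sample values are taken from the test mail shown later in this article.

```shell
#!/bin/sh
# Format the nine values that errnotify passes to en_method
# into a single summary line.
format_errnotify_args() {
    printf 'sequence=%s error_id=%s class=%s type=%s alert_flags=%s res_name=%s res_type=%s res_class=%s label=%s\n' \
        "$1" "$2" "$3" "$4" "$5" "$6" "$7" "$8" "$9"
}

# Sample call with the values of an errlogger test entry.
format_errnotify_args 142983 0xaa8ab241 O TEMP FALSE OPERATOR NONE NONE OPMSG
```

A wrapper script saved under an assumed path such as /usr/local/bin/errnotify_mail.sh could pipe this line to mail, with en_method set to that path followed by $1 $2 $3 $4 $5 $6 $7 $8 $9.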

    For example, we can generate an errnotify class instance stanza as follows:

    Listing 2. errnotify class instance stanza

     errnotify:
       en_name = "errlog_notify"
       en_persistenceflg = 1
       en_method = "mail -s \"Events occured in Error log: sequence = $1 error_id = $2 class = $3 type = $4 alert_flags = $5 res_name = $6 res_type = $7 res_class = $8 label = $9\" root"

    This stanza means that an email is sent to the root user whenever any AIX error log entry is recorded. The email carries the specific information of the error log entry.
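A stanza file like this can also be produced by a small script, which helps when the instance name or the mail recipient varies between hosts. The following sketch makes several assumptions: errnotify_stanza is a hypothetical helper, and the en_method shown (piping errpt -a -l $1 into mail) is an illustrative alternative to the method in Listing 2, not the method used there.

```shell
#!/bin/sh
# Print an errnotify stanza on stdout.
# $1 = en_name of the instance, $2 = mail recipient.
# \$1 and \$9 are escaped so they reach the stanza literally and are
# filled in later by errnotify, not by this script.
errnotify_stanza() {
    cat <<EOF
errnotify:
        en_name = "$1"
        en_persistenceflg = 1
        en_method = "errpt -a -l \$1 | mail -s 'Error log entry \$1 (\$9)' $2"
EOF
}

errnotify_stanza errlog_notify root
```

Redirecting the output to a file gives a stanza file that can be reviewed before it is loaded with odmadd.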

  2. Add a class instance to the ODM Database

    The AIX command odmadd adds the instances described in a stanza file to the ODM database.

    For example: odmadd /errpolicystanza

  3. Verify the ODM errnotify instance

    You can run the odmget command to verify that the errnotify class instance has been correctly added.

    Listing 3. Use the odmget command to view the errnotify class instance

     [node01][/]> odmget -q en_name="errlog_notify" errnotify

     errnotify:
             en_pid = 0
             en_name = "errlog_notify"
             en_persistenceflg = 1
             en_label = ""
             en_crcid = 0
             en_class = ""
             en_type = ""
             en_alertflg = ""
             en_resource = ""
             en_rtype = ""
             en_rclass = ""
             en_symptom = ""
             en_err64 = ""
             en_dup = ""
             en_method = "mail -s \"Events occured in Error log: sequence = $1 error_id = $2 class = $3 type = $4 alert_flags = $5 res_name = $6 res_type = $7 res_class = $8 label = $9  contents = \n`errpt -a -l $1\n\" root"

  4. Manually generate error log entries to test whether monitoring works

    After you confirm that the errnotify instance has been added correctly, use errlogger to generate an error log entry manually and test whether monitoring works. For example:

    Listing 4. Use the errlogger command to generate test logs

      errlogger "this is a test for Error log monitoring"

    Then check root's mailbox for the notification. If mail delivery to root is working properly, a message like the following arrives:

    Listing 5. Mail commands

     Message 37:
     From root Fri Mar 20 02:43:15 2009
     Date: Fri, 20 Mar 2009 02:43:15 -0400
     From: root
     To: root
     Subject: Events occured in Error log: sequence = 142983 error_id = 0xaa8ab241 class = O type = TEMP alert_flags = FALSE res_name = OPERATOR res_type = NONE res_class = NONE label = OPMSG

     contents =
     ---------------------------------------------------------------------------
     LABEL:          OPMSG
     IDENTIFIER:     AA8AB241

     Date/Time:       Fri Mar 20 02:43:14 EDT 2009
     Sequence Number: 142983
     Machine Id:      000181404C00
     Node Id:         hacsmdev3
     Class:           O
     Type:            TEMP
     Resource Name:   OPERATOR

     Description
     OPERATOR NOTIFICATION

     User Causes
     ERRLOGGER COMMAND

             Recommended Actions
             REVIEW DETAILED DATA

     Detail Data
     MESSAGE FROM ERRLOGGER COMMAND
     this is a test for Error log monitoring

  5. Stop monitoring AIX Error Log

    To stop using errnotify to monitor AIX error logs, you only need to delete the errnotify instance from the ODM database.

    Listing 6. Deleting an errnotify instance

      [node01][/]> odmdelete -q en_name="errlog_notify" -o errnotify
      1 objects deleted


Introduction to Linux Syslog/syslog-ng and Its Automatic Monitoring Mechanism

Most Linux systems use the syslog mechanism to record system logs. It is flexible and lets the system take different actions according to the log configuration. syslog classifies messages by the subsystem (facility) that generated them and by their priority, and the available actions include writing the message to a file, writing it to a device, or sending it to users. It can record local events as well as events from other hosts received over the network.

However, as more and more applications run on a system, one subsystem may be used by several applications at the same time, so unimportant messages can mask important ones. The generating subsystem and the priority alone then no longer clearly identify the messages that the system administrator cares about.

syslog-ng (the next-generation system logging tool) was created to address this. One of its design principles is finer-grained message filtering. syslog-ng can completely replace the syslog service and filters better through user-defined rules.

The following sections describe the syslog and syslog-ng mechanisms.

Linux Syslog/syslog-ng Mechanism Components

The Linux syslog/syslog-ng mechanism consists of the following parts:

  1. Device file /dev/log: receives the logs generated by the kernel and applications.
  2. Daemon: started automatically during system initialization. It monitors the log messages that the kernel and applications send to /dev/log and writes them to the log file. In the syslog mechanism the daemon is /sbin/syslogd; in the syslog-ng mechanism it is /sbin/syslog-ng.
  3. Configuration file: provides configuration information for the daemon. It is read when the daemon starts and specifies the logging rules. The default location is /etc/syslog.conf in the syslog mechanism and /etc/syslog-ng/syslog-ng.conf in the syslog-ng mechanism.
  4. Log file /var/log/messages: the default log file. It records messages of priority info or higher from all facilities except mail.

Configure and Use Linux Syslog/syslog-ng

On Linux, configuring syslog/syslog-ng is simple and flexible compared with AIX: edit the configuration entries in syslog.conf or syslog-ng.conf by hand, then restart the syslog service. The configuration formats are described below:

  • The format of the /etc/syslog.conf configuration file in the syslog mechanism is as follows:

    facility.level action

    facility.level is also called the selector. facility is the subsystem that generates the log, and level is the log level.

    For example:

    authpriv.* /var/log/secure

    This writes log entries of every priority from the authpriv subsystem to the file /var/log/secure.

    facility can be set to one of the following keywords:

    auth: authentication activity reported by pam_pwdb

    authpriv: authentication activity that includes private information (such as user names)

    cron: information related to cron and at

    daemon: information related to the inetd daemon

    ftp: FTP-related information

    kern: kernel information, which first passes through klogd

    lpr: information related to the print service

    mail: mail-related information

    mark: syslog internal function used to generate timestamps

    news: information from the news server

    syslog: information generated by syslog itself

    user: information generated by user programs

    uucp: information generated by uucp

    local0 to local7: reserved for custom programs; for example, local5 might be used for SSH

    *: wildcard that stands for all facilities except mark

    level can be set to one of the following keywords (listed from most to least severe):

    emerg: the system is unusable

    alert: conditions that must be corrected immediately

    crit: error conditions that prevent some functions of a tool or subsystem from working

    err: error conditions that block part of the functions of a tool or subsystem

    warning: warning information

    notice: normal but important conditions

    info: informational messages

    debug: other information that does not indicate a functional condition or problem

    none: no priority; usually used to exclude a facility's messages

    *: all levels except none
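A selector names a threshold rather than a single level: with traditional syslogd, facility.level selects messages at that level and at every more severe level. The sketch below illustrates that rule in plain shell; level_num and level_selected are hypothetical helper names, and modifiers such as "=" or "!" that some syslogd versions accept are deliberately not handled.

```shell
#!/bin/sh
# Map a syslog level keyword to its numeric severity (0 = most severe).
level_num() {
    case "$1" in
        emerg) echo 0 ;; alert) echo 1 ;; crit) echo 2 ;; err) echo 3 ;;
        warning) echo 4 ;; notice) echo 5 ;; info) echo 6 ;; debug) echo 7 ;;
        *) return 1 ;;
    esac
}

# Succeed if a message at level $2 is selected by "facility.$1".
# "*" selects every level and "none" selects nothing.
level_selected() {
    case "$1" in
        '*')  return 0 ;;
        none) return 1 ;;
    esac
    sel=$(level_num "$1") || return 2
    msg=$(level_num "$2") || return 2
    [ "$msg" -le "$sel" ]
}

# Show which levels a "*.warning" selector would pick up.
for lvl in emerg err warning info debug; do
    if level_selected warning "$lvl"; then
        echo "*.warning selects $lvl"
    else
        echo "*.warning skips $lvl"
    fi
done
```

This is why a selector such as authpriv.* is needed to capture everything from one facility, while authpriv.err would drop its warning, notice, info, and debug messages.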

  • The rule format of the configuration file /etc/syslog-ng/syslog-ng.conf in the syslog-ng mechanism is as follows:

    log { source s1; source s2; ... filter f1; filter f2; ... destination d1; destination d2; ... };

    Here, source is a source identifier, filter is a filter identifier, and destination is a destination identifier.

    For example:

    source src { unix-dgram("/dev/log"); };

    filter f_warn { level(warn, err, crit); };

    destination warn { file("/var/log/warn" fsync(yes)); };

    log { source(src); filter(f_warn); destination(warn); };

    This rule filters all log messages received on /dev/log and writes those with priority warn, err, or crit to the file /var/log/warn.

    You can use the source, filter, and destination definitions supplied with the system, or define your own in the format above. Typically only a custom destination is created and combined with a source and filter supplied by the system. For how these configuration items are defined, see the syslog-ng.conf man page.

Linux syslog-ng Monitoring Automation

On Linux, the logs of all subsystems except iptables, news, and mail are stored in /var/log/messages by default. As more applications run on the system, the logs produced by the related subsystems grow more complex, and separating the useful information from the system log and monitoring it automatically often gives the system administrator a headache.

The syslog-ng service provides a log filtering mechanism with finer message granularity and more flexible definitions. You can define different sources, filters, and destinations for different needs, so that specific log messages are stored in a specific location. The system administrator can then write a script to process these log files further, for example to check for system error messages regularly and notify the administrator by email.

The following steps show how to monitor the system errors in a specific set of log messages automatically:

  1. Define filter and destination as needed

    The definition rules for source, filter, and destination in /etc/syslog-ng/syslog-ng.conf were discussed above. Assume that an application uses local6 for network management, and that the messages from this subsystem with priority err (error) need to be monitored.

    Define the filter in /etc/syslog-ng/syslog-ng.conf as follows:

    filter f_local6err { level(err) and facility(local6); };

    Define the destination as follows:

    destination local6err { pipe("/var/log/local6.err" group(root) perm(0644)); };

    The destination is the FIFO /var/log/local6.err, which belongs to the root group and has access mode 0644.

  2. Set the log configuration entry

    After the filter and destination are defined, combine them into a syslog-ng log configuration entry, for example:

    log { source(src); filter(f_local6err); destination(local6err); };

    Then restart the syslog-ng service so that the new configuration takes effect:

    # /etc/init.d/syslog restart

    Shutting down syslog services done

    Starting syslog services done

    #

    All err logs from the local6 subsystem are now written to the FIFO /var/log/local6.err. The destination is a FIFO (named pipe) rather than a regular file mainly to prepare for the automatic monitoring in the next step. Note that the FIFO must be created with the command "mkfifo /var/log/local6.err" before it is used as a destination.

  3. Create a script to automatically monitor the target queue

    To deliver the system errors in the FIFO /var/log/local6.err to the system administrator promptly, create a script that reads the FIFO regularly and notifies the administrator by mail as soon as a new system error is found.

    The following Perl script is an example of such a monitoring script, for reference only.


    Listing 7. Automated scripts for monitoring queues

    #!/usr/bin/perl
    # Script monitor_fifo
    use strict;
    use warnings;

    # $fifo is the target queue
    my $fifo = "/var/log/local6.err";

    local $SIG{ALRM} = sub { die "alarm\n" };

    # First check the validity of the FIFO. open() blocks until a writer
    # exists, so give up quietly after 4 seconds.
    my $pipe;
    eval {
        alarm 4;
        open($pipe, '<', $fifo) or die "error: $fifo can not be opened: $!\n";
        alarm 0;
    };
    alarm 0;
    if ($@) {
        exit 0 if $@ =~ /^alarm/;
        print $@;
        exit 1;
    }

    # Read the FIFO until no new line arrives within 2 seconds (or EOF).
    my $allinfo = "";
    while (1) {
        my $line = eval {
            alarm 2;
            my $l = <$pipe>;
            alarm 0;
            $l;
        };
        last if $@ or !defined $line;
        $allinfo .= $line;
    }
    alarm 0;
    close $pipe;

    # Notify the system administrator by email if any error log was read.
    if ($allinfo ne "") {
        if (open(my $mail, '|-', 'mail', '-s', $fifo, 'root')) {
            print $mail $allinfo;
            close $mail or print "notification to root failed.\n";
        } else {
            print "notification to root failed.\n";
        }
    }
    exit 0;

    This script sends all log messages currently in the target FIFO to the root user by email. To monitor the FIFO regularly, use the crontab facility of the Linux system so that monitor_fifo is called at fixed intervals.

    For example:

    Use the crontab -e command to edit the crontab configuration file and add the following entry:

    */1 * * * * /root/monitor_fifo 1>/dev/null 2>/dev/null

    This entry calls the monitor_fifo script every minute.
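The read-with-timeout step can also be sketched in shell. This is not the article's script but an alternative under stated assumptions: it relies on the timeout utility from GNU coreutils, drain_fifo is a hypothetical function name, and the demo uses a throwaway FIFO with a background printf standing in for syslog-ng.

```shell
#!/bin/sh
# Print whatever arrives in a FIFO, giving up after a timeout.
# $1 = FIFO path, $2 = timeout in seconds (default 2).
drain_fifo() {
    timeout "${2:-2}" cat "$1"
    rc=$?
    # Exit status 124 means timeout stopped cat. That is the normal case
    # while the writer keeps its end of the pipe open.
    [ "$rc" -eq 0 ] || [ "$rc" -eq 124 ]
}

# Demo: a temporary FIFO and a background writer.
demo_dir=$(mktemp -d)
mkfifo "$demo_dir/demo.fifo"
printf 'test message1\ntest message2\n' > "$demo_dir/demo.fifo" &
drain_fifo "$demo_dir/demo.fifo" 2
wait
rm -rf "$demo_dir"
```

In a monitoring job, the output of drain_fifo /var/log/local6.err could be captured in a variable and piped to mail only when it is not empty.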

  4. Verify that the configuration is successful

    The administrator can use the logger command to generate several test logs.

    # logger -p local6.err "This is a local6.err test message1."

    # logger -p local6.err "This is a local6.err test message2."

    # logger -p local6.err "This is a local6.err test message3."

    These commands generate three system log messages whose facility is local6 and whose level is err, each with the text "This is a local6.err test message" followed by a number.

    If the configuration is correct, the messages are stored in the target FIFO /var/log/local6.err. When the /root/monitor_fifo script runs, it sends them to the root user, who can read them with the mail command.

    Listing 8. Mail commands

      # mail
      mailx version nail 11.25 7/29/05.  Type ? for help.
      "/var/mail/root": 1 message 1 new
      >N  1 root@p6hv8n02.clus Tue Apr 14 03:19   21/808   /var/log/monitor.warn
      ?
      Message  1:
      From root@p6hv8n02.clusters.com  Tue Apr 14 03:19:37 2009
      X-Original-To: root
      Delivered-To: root@p6hv8n02.clusters.com
      Date: Tue, 14 Apr 2009 03:19:37 +0000
      To: root@p6hv8n02.clusters.com
      Subject: /var/log/monitor.warn
      User-Agent: nail 11.25 7/29/05
      MIME-Version: 1.0
      Content-Type: text/plain; charset=us-ascii
      Content-Transfer-Encoding: 7bit
      From: root@p6hv8n02.clusters.com (root)

      Apr 14 03:19:26 p6hv8n02 root: This is a local6.err test message1.
      Apr 14 03:19:28 p6hv8n02 root: This is a local6.err test message2.
      Apr 14 03:19:30 p6hv8n02 root: This is a local6.err test message3.

    With this configuration, the system administrator can monitor system errors automatically with little effort.

  5. Notes

    On Linux, some security settings affect syslog/syslog-ng. Pay special attention to them during configuration; otherwise syslog/syslog-ng will not work properly.

    On Red Hat systems, SELinux (Security-Enhanced Linux) must not be enforcing when syslog/syslog-ng is used in this way. Proceed as follows:

    • Edit the /etc/selinux/config file and set SELINUX=permissive.
    • Restart the system so that the SELinux setting takes effect.

    On SLES 10 SP1 and later service pack levels, syslog/syslog-ng works normally only after the /etc/apparmor.d/sbin.syslog-ng file is modified or the syslog entries are deleted from the AppArmor profile list. The two methods are as follows:

    1. Modify the sbin.syslog-ng file to give the FIFO /var/log/local6.err read and write permission, then restart the boot.apparmor service:

    Listing 9. Modifying apparmor Access Permissions

      # cat /etc/apparmor.d/sbin.syslog-ng
      #include
      /sbin/syslog-ng {
      #include
      .
      .
      .
      /var/run/syslog-ng.pid w,
      /var/log/local6.err wr,
      }
      # /etc/init.d/boot.apparmor restart

    2. Delete the syslog entries from the AppArmor profile list, then restart the boot.apparmor service:

    # rm -f /etc/apparmor.d/sbin.syslogd

    # rm -f /etc/apparmor.d/sbin.syslog-ng

    # /etc/init.d/boot.apparmor restart

  6. Cancel the configuration

    Canceling a Linux syslog/syslog-ng configuration is simple: edit /etc/syslog.conf or /etc/syslog-ng/syslog-ng.conf by hand, delete the related configuration entries, and restart the syslog service.

    In this example, besides removing the syslog/syslog-ng configuration, the entry added to crontab must also be removed.

Summary

This article described the logging mechanisms of AIX and Linux, the AIX error log and Linux syslog, and how to monitor each of them automatically. Monitoring system logs gives the system administrator rich information about how the system is running, and automating that monitoring helps the administrator react quickly when the system behaves abnormally.
