How to analyze Linux logs


There is a lot of information in your logs, although it is not always easy to extract. In this article, we'll look at some examples of basic log analysis you can do right now (you just need to search). We'll also cover some more advanced analysis that requires upfront setup but saves a lot of time later. Examples of advanced analysis include generating summary counts, filtering on field values, and so on.

We'll start by showing you how to use several different tools on the command line, and then show how a log management tool can automate most of the heavy lifting to make log analysis easier.

Searching with Grep

Searching for text is the most basic way to find information. The most common tool for searching text is grep. This command-line tool, available in most Linux distributions, lets you search your logs with regular expressions. A regular expression is a pattern written in a special language that identifies matching text. The simplest pattern is to put the string you want to find in quotation marks.

Regular expressions

Here is an example of searching the authentication log of an Ubuntu system for "user hoover":

$ grep "user hoover" /var/log/auth.log

Accepted password for hoover from 10.0.2.2 port 4792 ssh2

pam_unix(sshd:session): session opened for user hoover by (uid=0)

pam_unix(sshd:session): session closed for user hoover

It can be difficult to build precise regular expressions. For example, if we want to search for a number like the port "4792", the pattern may also match timestamps, URLs, and other unwanted data. In the following Ubuntu example, it matches an Apache log line that we do not want.

$ grep "4792" /var/log/auth.log

Accepted password for hoover from 10.0.2.2 port 4792 ssh2

74.91.21.46 - - [31/Mar/2015:19:44:32 +0000] "GET /scripts/samples/search?q=4792 HTTP/1.0" 404 545 "-" "-"
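One way to reduce such false positives is to include nearby context in the pattern itself, for example the word "port" before the number. A minimal sketch, using made-up sample lines rather than a real auth.log:

```shell
# Build a small sample log containing a real ssh line and an
# Apache-style line where the same number appears in a URL (made-up data).
log=$(mktemp)
cat > "$log" <<'EOF'
Accepted password for hoover from 10.0.2.2 port 4792 ssh2
74.91.21.46 - - [31/Mar/2015:19:44:32 +0000] "GET /search?q=4792 HTTP/1.0" 404 545
EOF

grep -c "4792" "$log"        # matches both lines: prints 2
grep -c "port 4792" "$log"   # matches only the ssh line: prints 1

rm -f "$log"
```

grep -w (whole-word matching) is another option for keeping a bare number from matching inside a longer alphanumeric token.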

Surround Search

Another useful trick is to use grep to do a surround search. This shows you the lines just before or after a match, which can help you debug whatever caused the error or problem. The -B option shows the preceding lines, and the -A option shows the following lines. For example, we know that when someone fails to log in as an administrator and their IP does not reverse-resolve, they may not have a valid domain name. This is very suspicious!

$ grep -B 3 -A 2 'Invalid user' /var/log/auth.log

Apr 17:06:20 ip-172-31-11-241 sshd[12545]: Reverse mapping checking getaddrinfo for 216-19-2-8.commspeed.net [216.19.2.8] failed - POSSIBLE BREAK-IN ATTEMPT!

Apr 17:06:20 ip-172-31-11-241 sshd[12545]: Received disconnect from 216.19.2.8: 11: Bye Bye [preauth]

Apr 17:06:20 ip-172-31-11-241 sshd[12547]: Invalid user admin from 216.19.2.8

Apr 17:06:20 ip-172-31-11-241 sshd[12547]: input_userauth_request: invalid user admin [preauth]

Apr 17:06:20 ip-172-31-11-241 sshd[12547]: Received disconnect from 216.19.2.8: 11: Bye Bye [preauth]

Tail

You can also use grep together with tail to get the last few lines of a file, or to follow the log and print events in real time. This is useful when you are making interactive changes, such as starting a server or testing a code change.

$ tail -f /var/log/auth.log | grep 'Invalid user'

Apr 19:49:48 ip-172-31-11-241 sshd[6512]: Invalid user ubnt from 219.140.64.136

Apr 19:49:49 ip-172-31-11-241 sshd[6514]: Invalid user admin from 219.140.64.136
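Since tail -f keeps following the file until you interrupt it, a convenient way to experiment non-interactively is tail -n, which prints just the last few lines once. A sketch with made-up sample data:

```shell
# Sample log whose most recent events are failed logins (made-up data).
log=$(mktemp)
cat > "$log" <<'EOF'
sshd[6510]: Accepted password for hoover from 10.0.2.2 port 4792 ssh2
sshd[6512]: Invalid user ubnt from 219.140.64.136
sshd[6514]: Invalid user admin from 219.140.64.136
EOF

# Look at only the last two lines instead of following the whole file.
tail -n 2 "$log" | grep 'Invalid user'

rm -f "$log"
```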

A full treatment of grep and regular expressions is beyond the scope of this guide, but Ryan's Tutorials give a more in-depth introduction.

Log management systems have higher performance and more powerful search capabilities. They typically index the data and run queries in parallel, so you can search gigabytes or terabytes of logs in seconds. In contrast, grep can take minutes or, in extreme cases, even hours. Log management systems also use Lucene-like query languages, which offer a simpler syntax for retrieving numbers, fields, and more.

Parsing with Cut, AWK, and Grok

Command-line tools

Linux offers several command-line tools for parsing and analyzing text. They are useful when you want to quickly parse a small amount of data, but processing large volumes can take a long time.

Cut

The cut command lets you parse fields from delimited logs. Delimiters are characters, such as equal signs or commas, that separate fields or key-value pairs.

Let's say we want to parse the user out of the following log:

pam_unix(su:auth): authentication failure; logname=hoover uid=1000 euid=0 tty=/dev/pts/0 ruser=hoover rhost= user=root

We can use the cut command as follows to get the text of the eighth field when the line is split on the equal sign. Here is an example from an Ubuntu system:

$ grep "authentication failure" /var/log/auth.log | cut -d '=' -f 8

root

hoover

root

nagios

nagios
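To see where the field number comes from, you can push a single sample line through the same cut invocation. Splitting the pam_unix line above on '=' makes "root" the eighth field:

```shell
# The sample pam_unix line from above.
line='pam_unix(su:auth): authentication failure; logname=hoover uid=1000 euid=0 tty=/dev/pts/0 ruser=hoover rhost= user=root'

# Split on '='; field 7 is " user" and field 8 is the value "root".
echo "$line" | cut -d '=' -f 8   # prints: root
```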

Awk

Alternatively, you can use awk, which provides more powerful field-parsing capabilities. It offers a scripting language with which you can filter out almost anything irrelevant.

For example, suppose we have the following log line on an Ubuntu system and we want to extract the user name that failed to log in:

Mar 08:28:18 ip-172-31-11-241 sshd[32701]: input_userauth_request: invalid user guest [preauth]

You can use the awk command as follows. First, the regular expression /sshd.*invalid user/ matches the sshd invalid user lines. Then {print $9} prints the ninth field, using the default delimiter of whitespace. This outputs the user name.

$ awk '/sshd.*invalid user/ { print $9 }' /var/log/auth.log

guest

admin

info

test

ubnt
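To check the field numbering yourself, feed one line through awk. The count below assumes a full syslog timestamp of the form "Mon DD HH:MM:SS"; the day and host here are made up for illustration:

```shell
# Hypothetical sshd line with a complete "Mon DD HH:MM:SS" timestamp.
line='Mar 24 08:28:18 ip-172-31-11-241 sshd[32701]: input_userauth_request: invalid user guest [preauth]'

# Whitespace-separated fields: Mar=1, 24=2, 08:28:18=3, hostname=4,
# sshd[32701]:=5, input_userauth_request:=6, invalid=7, user=8, guest=9
echo "$line" | awk '/sshd.*invalid user/ { print $9 }'   # prints: guest
```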

You can read more about how to use regular expressions and print fields in the Awk User's Guide.

Log Management System

Log management systems make parsing easier, letting users quickly analyze many log files. They can automatically parse standard log formats, such as common Linux logs and web server logs. This saves a lot of time, because you don't have to write your own parsing logic while troubleshooting a system problem.

Below is an example of an sshd log parsed in Loggly, with the remoteHost and user fields extracted.

You can also define custom parsing for non-standard formats. A common tool is Grok, which uses a library of common regular expressions to parse raw text into structured JSON. Here is an example configuration in which Grok parses kernel log files in Logstash:

filter {
  grok {
    match => { "message" => "%{CISCOTIMESTAMP:timestamp} %{HOST:host} %{WORD:program}%{NOTSPACE} %{NOTSPACE}%{NUMBER:duration}%{NOTSPACE} %{GREEDYDATA:kernel_logs}" }
  }
}

Filtering with Rsyslog and AWK

Filtering lets you search on a specific field value instead of doing a full-text search. This makes your log analysis more accurate, because it ignores unwanted matches from other parts of the log message. To search on a field value, you first need to parse the log, or at least have a way of retrieving the event structure.

How to filter on one application

Typically, you only want to see the logs of one application. This is easy if your application saves its records to a single file, but more complicated if you need to filter one application out of an aggregated or centralized log. Here are several ways to do it:

Use the rsyslog daemon to parse and filter logs. The following example writes logs from the sshd application to a file named sshd-messages, then discards the event so it is not repeated elsewhere. You can try this example by adding it to your rsyslog.conf file.

:programname, isequal, "sshd" /var/log/sshd-messages

&~
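If you just want to preview which lines such a rule would pick up, without touching rsyslog.conf, a rough command-line approximation is to match on the program-name field. This sketch assumes the usual syslog layout where the program name is the fifth whitespace-separated field, and uses made-up sample lines:

```shell
# Sample syslog lines from two different programs (made-up data).
log=$(mktemp)
cat > "$log" <<'EOF'
Mar 24 08:28:18 myhost sshd[32701]: Invalid user guest from 10.0.2.2
Mar 24 08:28:19 myhost cron[1004]: (root) CMD (run-parts /etc/cron.hourly)
Mar 24 08:28:20 myhost sshd[32703]: Connection closed by 10.0.2.2
EOF

# Keep only lines whose program field starts with "sshd", roughly what
# the :programname, isequal, "sshd" selector does inside rsyslog.
awk '$5 ~ /^sshd\[/' "$log"

rm -f "$log"
```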

Use a command-line tool such as awk to extract the value of a particular field, such as the sshd user name. Here is an example from an Ubuntu system.

$ awk '/sshd.*invalid user/ { print $9 }' /var/log/auth.log

guest

admin

info

test

ubnt

Use a log management system to parse the logs automatically, then click to filter on the desired application name. Below we extract the syslog fields in the Loggly log management service and filter on the application name "sshd".

How to filter errors

The thing people most often want to see in their logs is errors. Unfortunately, the default syslog configuration does not output the severity of errors directly, which makes them difficult to filter.

Here are two ways to solve this problem. First, you can modify your rsyslog configuration to output the severity in the log file, making it easier to view and search. In your rsyslog configuration you can add a template with pri-text, like this:

"<%pri-text%>: %timegenerated%,%HOSTNAME%,%syslogtag%,%msg%\n"

This template produces output in the following format. You can see the err that indicates the severity in this message.

<authpriv.err>: Mar 18:18:00,hoover-virtualbox,su[5026]:, pam_authenticate: Authentication failure

You can use awk or grep to search for just the error messages. In this Ubuntu example, we include a few extra characters in the pattern, the . and the >, so that it matches only the severity field.

$ grep '.err>' /var/log/auth.log

<authpriv.err>: Mar 18:18:00,hoover-virtualbox,su[5026]:, pam_authenticate: Authentication failure
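To see what the extra characters buy you, compare a bare search for "err" with the anchored pattern on two made-up lines; the bare pattern also hits "err" inside ordinary words, while ".err>" hits only the severity field:

```shell
# Two sample lines: a real error, and a line that merely contains
# "err" inside the word "transferred" (made-up data).
log=$(mktemp)
cat > "$log" <<'EOF'
<authpriv.err>: su[5026]:, pam_authenticate: Authentication failure
<daemon.info>: rsync[1220]: file transferred to /backup
EOF

grep -c 'err'   "$log"   # prints 2 -- "transferred" matches too
grep -c '.err>' "$log"   # prints 1 -- only the severity field matches

rm -f "$log"
```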

Your second option is to use a log management system. A good log management system automatically parses syslog messages and extracts the severity field, and it lets you filter on specific errors with a single click.

Here, the highlighted err in the syslog severity field shows that we are filtering on errors.


