How to analyze Linux logs


Logs contain a wealth of information, but it is not always easy to extract. In this article, we'll show you some examples of basic log analysis you can do right away (you just need to search). We'll also cover some more advanced analyses, which require some setup up front but save a lot of time later on. Examples of advanced analysis include generating summary counts and filtering on field values.

We'll start by showing you how to use several different tools on the command line, and then show how a log management tool can automate most of the heavy lifting to make log analysis easier.

Search with Grep

Searching for text is the most basic way to find information. The most common tool for searching text is grep. This command-line tool is available in most Linux distributions and allows you to search for logs with regular expressions. A regular expression is a pattern written in a special language that recognizes matching text. The simplest pattern is to enclose the string you want to look up in quotation marks.

Regular expressions

This is an example of finding "user hoover" in the authentication log on an Ubuntu system:

$ grep "user hoover" /var/log/auth.log

Accepted password for hoover from 10.0.2.2 port 4792 ssh2

pam_unix(sshd:session): session opened for user hoover by (uid=0)

pam_unix(sshd:session): session closed for user hoover

It can be difficult to build precise regular expressions. For example, if we wanted to search for a number like the port "4792", it might also match timestamps, URLs, and other unwanted data. In the following Ubuntu example, it matches an Apache log line that we don't want.

$ grep "4792" /var/log/auth.log

Accepted password for hoover from 10.0.2.2 port 4792 ssh2

74.91.21.46 - - [31/Mar/2015:19:44:32 +0000] "GET /scripts/samples/search?q=4792 HTTP/1.0" 404 545 "-" "-"
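One way to tighten the match is to anchor the number to its surrounding text. The following is a minimal, self-contained sketch; the sample file path and its contents are made up for illustration:

```shell
# Build a small sample log (made-up data) containing both an ssh event
# and an Apache access line that share the number 4792.
cat > /tmp/auth-sample.log <<'EOF'
Accepted password for hoover from 10.0.2.2 port 4792 ssh2
74.91.21.46 - - [31/Mar/2015:19:44:32 +0000] "GET /search?q=4792 HTTP/1.0" 404 545 "-" "-"
EOF

# A bare number matches both lines...
grep -c "4792" /tmp/auth-sample.log

# ...while anchoring it to "port " matches only the ssh event.
grep -c "port 4792" /tmp/auth-sample.log
```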

Surround Search

Another useful trick is that you can use grep to do a surround search. This shows you the lines a few before or after a match, which can help you debug whatever caused an error or problem. The -B option shows the preceding lines, and the -A option shows the following lines. For example, we know that when someone fails to log in as admin and their IP doesn't reverse-resolve, it may mean they don't have a valid domain name. This is very suspicious!

$ grep -B 3 -A 2 'Invalid user' /var/log/auth.log

Apr 28 17:06:20 ip-172-31-11-241 sshd[12545]: Reverse mapping checking getaddrinfo for 216-19-2-8.commspeed.net [216.19.2.8] failed - POSSIBLE BREAK-IN ATTEMPT!

Apr 28 17:06:20 ip-172-31-11-241 sshd[12545]: Received disconnect from 216.19.2.8: 11: Bye Bye [preauth]

Apr 28 17:06:20 ip-172-31-11-241 sshd[12547]: Invalid user admin from 216.19.2.8

Apr 28 17:06:20 ip-172-31-11-241 sshd[12547]: input_userauth_request: invalid user admin [preauth]

Apr 28 17:06:20 ip-172-31-11-241 sshd[12547]: Received disconnect from 216.19.2.8: 11: Bye Bye [preauth]
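To see the context options in action on a file you control, here is a self-contained sketch; the sample file and its contents are assumptions for illustration:

```shell
# Sample log: one suspicious event surrounded by normal lines (made up).
cat > /tmp/ctx-sample.log <<'EOF'
line one
line two
line three
Invalid user admin from 216.19.2.8
line five
EOF

# -B 3 prints the three lines before the match; -A 1 prints the one after,
# so the whole five-line window around the event is shown.
grep -B 3 -A 1 'Invalid user' /tmp/ctx-sample.log
```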

Tail

You can also use grep together with tail to get the last few lines of a file, or to follow the log and print lines in real time. This is useful when you are making interactive changes, such as starting a server or testing code changes.

$ tail -f /var/log/auth.log | grep 'Invalid user'

Apr 28 19:49:48 ip-172-31-11-241 sshd[6512]: Invalid user ubnt from 219.140.64.136

Apr 28 19:49:49 ip-172-31-11-241 sshd[6514]: Invalid user admin from 219.140.64.136
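Since tail -f runs until interrupted, a testable variant uses tail -n to take just the last few lines of a file before filtering. The sample file and its contents below are made up for illustration:

```shell
# Sample log with older and newer entries (made-up data).
cat > /tmp/tail-sample.log <<'EOF'
old entry 1
old entry 2
Invalid user ubnt from 219.140.64.136
Accepted password for hoover from 10.0.2.2 port 4792 ssh2
Invalid user admin from 219.140.64.136
EOF

# Take only the last 3 lines, then keep the failed-login attempts.
tail -n 3 /tmp/tail-sample.log | grep 'Invalid user'
```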

A detailed description of grep and regular expressions is beyond the scope of this guide, but Ryan's Tutorials has a more in-depth introduction.

Log management systems have higher performance and more powerful search capabilities. They typically index data and run queries in parallel, so you can search gigabytes or terabytes of logs in seconds. By contrast, grep could take minutes, or in extreme cases even hours. Log management systems also use Lucene-like query languages, which offer a simpler syntax for searching on numbers, fields, and so on.

Parsing with Cut, AWK, and Grok

Command-line tools

Linux provides several command-line tools for text parsing and analysis. They are useful when you want to quickly parse a small amount of data, but processing large volumes can take a long time.

Cut

The cut command allows you to parse fields from delimited logs. Delimiters are characters like equal signs or commas that separate fields or key-value pairs.

Let's say we want to parse the user out of the following log:

pam_unix(su:auth): authentication failure; logname=hoover uid=1000 euid=0 tty=/dev/pts/0 ruser=hoover rhost= user=root

We can use the cut command as follows to get the text of the eighth field when splitting on the equal sign. This is an example on an Ubuntu system:

$ grep "authentication failure" /var/log/auth.log | cut -d '=' -f 8

root

hoover

root

nagios

nagios
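The same field extraction can be checked end to end on a single sample line. This sketch uses a made-up file mirroring the pam_unix format shown above:

```shell
# One pam_unix failure line, in the key=value format shown above (made up).
cat > /tmp/pam-sample.log <<'EOF'
pam_unix(su:auth): authentication failure; logname=hoover uid=1000 euid=0 tty=/dev/pts/0 ruser=hoover rhost= user=root
EOF

# Split on '=' and take the 8th field: the value following "user=".
grep "authentication failure" /tmp/pam-sample.log | cut -d '=' -f 8
```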

Awk

Alternatively, you can use awk, which provides more powerful field-parsing capabilities. It offers a scripting language with which you can filter out almost everything irrelevant.

For example, suppose we have the following log line on an Ubuntu system, and we want to extract the user name that failed to log in:

Mar 16 08:28:18 ip-172-31-11-241 sshd[32701]: input_userauth_request: invalid user guest [preauth]

You can use the awk command as follows. First, the regular expression /sshd.*invalid user/ matches the sshd invalid user lines. Then {print $9} prints the ninth field, based on the default delimiter of whitespace. This outputs the user names.

$ awk '/sshd.*invalid user/ { print $9 }' /var/log/auth.log

guest

admin

info

test

ubnt

You can read more about how to use regular expressions and output fields in the Awk User Guide.
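awk can also produce the summary counts mentioned in the introduction, using an associative array keyed on the user name field ($9). This is a sketch on a made-up sample file in the format shown above:

```shell
# Sample sshd lines in the format shown above (made-up data).
cat > /tmp/sshd-sample.log <<'EOF'
Mar 16 08:28:18 ip-172-31-11-241 sshd[32701]: input_userauth_request: invalid user guest [preauth]
Mar 16 08:28:19 ip-172-31-11-241 sshd[32702]: input_userauth_request: invalid user admin [preauth]
Mar 16 08:28:20 ip-172-31-11-241 sshd[32703]: input_userauth_request: invalid user admin [preauth]
EOF

# Tally the 9th field (the user name) and print a count per user,
# highest count first.
awk '/sshd.*invalid user/ { count[$9]++ } END { for (u in count) print count[u], u }' /tmp/sshd-sample.log | sort -rn
```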

Log Management System

Log management systems make parsing easier and let users quickly analyze large numbers of log files. They can automatically parse standard log formats, such as common Linux logs and web server logs. This saves a lot of time, because you don't have to write parsing logic yourself while troubleshooting a system problem.

The following is an example of an sshd log message parsed to extract the remotehost and user fields. This is a screenshot from Loggly, a cloud-based log management service.

You can also customize parsing for non-standard formats. A common tool is Grok, which uses a library of common regular expressions to parse raw text into structured JSON. Here is an example configuration in which Grok parses kernel log files in Logstash:

filter {
  grok {
    match => { "message" => "%{CISCOTIMESTAMP:timestamp} %{HOST:host} %{WORD:program}%{NOTSPACE} %{NOTSPACE} %{NUMBER:duration}%{NOTSPACE} %{GREEDYDATA:kernel_logs}" }
  }
}

Filtering with Rsyslog and AWK

Filtering allows you to search on a specific field value instead of doing a full-text search. This makes your log analysis more accurate, because it ignores unwanted matches from other parts of the log message. In order to search on a field value, you first need to parse your logs, or at least have a way of searching based on the event structure.

How to filter your app

Often, you may want to see only the logs from one application. This is easy if your app saves its records to a single file. It is more complicated if you need to filter one application in an aggregated or centralized log. Here are several ways to accomplish this:

Use the rsyslog daemon to parse and filter logs. The following example writes the logs from the sshd application to a file named sshd-messages, then discards the event so it is not repeated elsewhere. You can try this example by adding it to your rsyslog.conf file.

:programname, isequal, "sshd" /var/log/sshd-messages

& ~

Extract the value of a specific field, such as the sshd user name, using a command-line tool like awk. Here is an example on an Ubuntu system.

$ awk '/sshd.*invalid user/ { print $9 }' /var/log/auth.log

guest

admin

info

test

ubnt

Use a log management system to parse the logs automatically, then click to filter on the desired application name. The following screenshot shows the syslog fields extracted in the Loggly log management service, where we are filtering on the application name "sshd".

How to filter errors

One of the things people most want to see in their logs is errors. Unfortunately, the default syslog configuration does not output the severity of errors directly, making them difficult to filter.

Here are two ways to solve this problem. First, you can modify your rsyslog configuration to output the severity in the log file, making it easy to read and search. In your rsyslog configuration you can add a template with pri-text, like this:

"<%pri-text%>: %timegenerated%,%HOSTNAME%,%syslogtag%,%msg%\n"

This example outputs in the following format. You can see the err in this message, indicating the error severity.

<authpriv.err>: Mar 18:18:00,hoover-virtualbox,su[5026]:, pam_authenticate: authentication failure
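Putting the template to use requires naming it and attaching it to a log action. This is a hedged sketch of rsyslog.conf lines using the legacy syntax; the template name SeverityFormat and the output path are assumptions, not from the original article:

```
# Define the template (legacy rsyslog syntax), then apply it to an action
# by appending ";TemplateName" to the file path.
$template SeverityFormat,"<%pri-text%>: %timegenerated%,%HOSTNAME%,%syslogtag%,%msg%\n"
authpriv.*    /var/log/auth.log;SeverityFormat
```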

You can use awk or grep to retrieve just the error messages. In this Ubuntu example, we include some of the surrounding syntax, such as the . and the >, so that we match only the severity field.

$ grep '.err>' /var/log/auth.log

<authpriv.err>: Mar 18:18:00,hoover-virtualbox,su[5026]:, pam_authenticate: authentication failure

Your second option is to use a log management system. A good log management system automatically parses syslog messages and extracts the severity fields. It also lets you filter on specific errors in log messages with a single click.

A screenshot of the syslog severity field with the error level highlighted, showing that we are filtering on errors.
