Use logs in Linux to troubleshoot errors.

Last Update:2015-09-12 Source: Internet

Author: User



Source: Loggly. Translation source: LCTT.

The main reason logs exist is troubleshooting. You typically turn to them to diagnose why a problem occurred in your Linux system or application. An error message or a sequence of events can give you clues to the root cause, show how the problem came about, and point to a fix. Here are a few examples of using logs to solve problems.

Cause of Login Failures

If you want to check whether your system is secure, look in the authentication log for failed logins and for successful logins by suspicious users. Authentication fails when someone supplies incorrect or otherwise invalid credentials, which usually happens when logging in remotely over SSH or when using su to switch to another local user. These events are recorded by the Pluggable Authentication Modules (PAM) system. Failures show up as strings such as Failed password and user unknown, while successful authentication records include strings such as Accepted password and session opened.

Example of failure:

    pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=10.0.2.2
    Failed password for invalid user hoover from 10.0.2.2 port 4791 ssh2
    pam_unix(sshd:auth): check pass; user unknown
    PAM service(sshd) ignoring max retries; 6 > 3

Successful example:

    Accepted password for hoover from 10.0.2.2 port 4792 ssh2
    pam_unix(sshd:session): session opened for user hoover by (uid=0)
    pam_unix(sshd:session): session closed for user hoover

You can use grep to find the user names with the most failed logins. These are accounts that attackers are trying, and failing, to access. This example is from Ubuntu.

    $ grep "invalid user" /var/log/auth.log | cut -d ' ' -f 10 | sort | uniq -c | sort -nr
    23 oracle
    18 postgres
    17 nagios
    10 zabbix
    6 test
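
The field number passed to cut depends on the exact layout of each log line, so it can shift between systems. As a rough sketch of a less position-dependent variant, the pipeline below pulls out the word that follows "invalid user" with sed. The sample log lines and the variable names sample_log and failed_by_user are made up for illustration; on a real Ubuntu system you would read /var/log/auth.log instead.

```shell
#!/bin/sh
# Hypothetical sample data standing in for /var/log/auth.log.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
Mar 19 18:36:41 host sshd[101]: Failed password for invalid user oracle from 10.0.2.2 port 4791 ssh2
Mar 19 18:36:45 host sshd[102]: Failed password for invalid user oracle from 10.0.2.2 port 4792 ssh2
Mar  9 18:36:50 host sshd[103]: Failed password for invalid user test from 10.0.2.2 port 4793 ssh2
Mar 19 18:37:02 host sshd[104]: Accepted password for hoover from 10.0.2.2 port 4794 ssh2
EOF

# sed -n with the p flag prints only lines where the substitution matched,
# keeping just the word that follows "invalid user".
failed_by_user=$(sed -n 's/.*invalid user \([^ ]*\).*/\1/p' "$sample_log" \
  | sort | uniq -c | sort -nr)

echo "$failed_by_user"
rm -f "$sample_log"
```

Because the match is anchored on the text "invalid user" rather than on a column, a single-digit day such as "Mar  9" (padded with an extra space) does not change the result.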

Because there is no standard log format, you need a different command for each application's logs. A log management system can parse logs automatically and classify them so that you can extract key fields, such as user names.

A log management system can use automatic parsing to extract user names from Linux logs, which lets you view user information and filter on it with a single click. In one example, filtering the logs to show only login attempts by the root user reveals more than 2,700 attempts.

A log management system also lets you view timeline charts, which makes anomalies easier to spot. If someone fails to log in once or twice within a few minutes, it may be a legitimate user who forgot the password. However, hundreds of failed logins using different user names are more likely an attempt to attack the system. In one example, someone tried to log in as nagios hundreds of times on March 12. This is clearly not a legitimate user of the system.
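
Without a log management system, a rough version of the same signal can be computed on the command line. The sketch below counts, for each day, the failed attempts and the number of distinct user names tried; many attempts spread over many names suggests an attack rather than a forgotten password. The sample lines and the names sample_log and per_day are assumptions for illustration, and the awk script assumes the user name is the word just before "from".

```shell
#!/bin/sh
# Hypothetical sample data standing in for an authentication log.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
Mar 12 10:00:01 host sshd[201]: Failed password for invalid user nagios from 10.0.2.2 port 5001 ssh2
Mar 12 10:00:03 host sshd[202]: Failed password for invalid user nagios from 10.0.2.2 port 5002 ssh2
Mar 12 10:00:05 host sshd[203]: Failed password for invalid user oracle from 10.0.2.2 port 5003 ssh2
Mar 13 09:15:00 host sshd[204]: Failed password for hoover from 10.0.2.7 port 5004 ssh2
EOF

# Output columns: month, day, failed attempts, distinct user names tried.
per_day=$(awk '/Failed password/ {
    day = $1 " " $2
    # Assumption: the user name is the word immediately before "from".
    for (i = 2; i <= NF; i++) if ($i == "from") user = $(i - 1)
    attempts[day]++
    if (!seen[day, user]++) users[day]++
}
END {
    for (d in attempts) print d, attempts[d], users[d]
}' "$sample_log" | sort)

echo "$per_day"
rm -f "$sample_log"
```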

Cause of Restart

Sometimes, a server goes down due to a system crash or restart. How do you know when it happened? Who did it?

Shutdown command

If someone runs the shutdown command manually, you can see it in the authentication log file. Here, you can see that someone logged in remotely as the user ubuntu from the IP address 50.0.134.125 and then restarted the system with shutdown -r.

    Mar 19 18:36:41 ip-172-31-11-231 sshd[23437]: Accepted publickey for ubuntu from 50.0.134.125 port 52538 ssh
    Mar 19 18:36:41 ip-172-31-11-231 sshd[23437]: pam_unix(sshd:session): session opened for user ubuntu by (uid=0)
    Mar 19 18:37:09 ip-172-31-11-231 sudo: ubuntu : TTY=pts/1 ; PWD=/home/ubuntu ; USER=root ; COMMAND=/sbin/shutdown -r now
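
One way to find such entries is to search the sudo records for shutdown-style commands. The following is only a sketch: the sample file, the second (apt-get) line, and the variable names are invented for illustration, and it only catches commands run through sudo, not ones issued directly from a root shell.

```shell
#!/bin/sh
# Hypothetical sample data; the first line is taken from the log excerpt above.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
Mar 19 18:37:09 ip-172-31-11-231 sudo: ubuntu : TTY=pts/1 ; PWD=/home/ubuntu ; USER=root ; COMMAND=/sbin/shutdown -r now
Mar 19 18:40:12 ip-172-31-11-231 sudo: ubuntu : TTY=pts/1 ; PWD=/home/ubuntu ; USER=root ; COMMAND=/usr/bin/apt-get update
EOF

# Keep sudo entries whose command is shutdown, reboot, halt or poweroff,
# then reduce each line to "timestamp user command".
restarts=$(grep -E 'sudo: .*COMMAND=[^ ]*/(shutdown|reboot|halt|poweroff)( |$)' "$sample_log" \
  | sed -E 's/^([A-Za-z]+ +[0-9]+ [0-9:]+) [^ ]+ sudo: +([^ ]+) .*COMMAND=(.*)$/\1 \2 \3/')

echo "$restarts"
rm -f "$sample_log"
```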

Kernel Initialization

If you want to see every server restart, whatever the cause (including crashes), search the kernel initialization log. Look for messages from the kernel facility that report CPU initialization.

    Mar 19 18:39:30 ip-172-31-11-231 kernel: [ 0.000000] Initializing cgroup subsys cpuset
    Mar 19 18:39:30 ip-172-31-11-231 kernel: [ 0.000000] Initializing cgroup subsys cpu
    Mar 19 18:39:30 ip-172-31-11-231 kernel: [ 0.000000] Linux version 3.8.0-44-generic (buildd@tipua) (gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) ) #66~precise1-Ubuntu SMP Tue Jul 15 04:01:04 UTC 2014 (Ubuntu 3.8.0-44.66~precise1-generic 3.8.13.25)
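
Since the kernel prints its "Linux version" banner once per boot, listing those lines gives an approximate boot history. This is a minimal sketch under assumptions: the sample file and the second boot entry are made up, and on a real system the kernel messages may live in /var/log/syslog, /var/log/kern.log, or /var/log/messages depending on the distribution.

```shell
#!/bin/sh
# Hypothetical sample data standing in for the kernel log.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
Mar 19 18:39:30 ip-172-31-11-231 kernel: [    0.000000] Initializing cgroup subsys cpuset
Mar 19 18:39:30 ip-172-31-11-231 kernel: [    0.000000] Linux version 3.8.0-44-generic (buildd@tipua)
Mar 21 07:02:11 ip-172-31-11-231 kernel: [    0.000000] Linux version 3.8.0-44-generic (buildd@tipua)
EOF

# Print the syslog timestamp (month, day, time) of every kernel version banner.
boot_times=$(awk '/kernel: .*Linux version/ {print $1, $2, $3}' "$sample_log")

echo "$boot_times"
rm -f "$sample_log"
```

Comparing these timestamps with the shutdown commands in the authentication log helps separate planned restarts from crashes: a boot with no preceding shutdown command deserves a closer look.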

Detect memory problems

The server may crash for many reasons, but one common cause is that the memory is exhausted.

When your system runs low on memory, the kernel kills processes, generally starting with the one using the most resources. The error occurs when the system has used all of its memory and a new or existing process tries to use more. Search your log files for a string like Out of Memory or for kernel warnings that mention kill. These messages indicate that the system deliberately killed the process or application, rather than the process crashing on its own.

For example:

    [33238.178288] Out of memory: Kill process 6230 (firefox) score 53 or sacrifice child
    [29923450.995084] select 5230 (docker), adj 0, size 708, to kill

You can use tools like grep to find these log entries. This example is from Ubuntu:

    $ grep "Out of memory" /var/log/syslog
    [33238.178288] Out of memory: Kill process 6230 (firefox) score 53 or sacrifice child
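
To see at a glance which processes were killed, the matching lines can be reduced to a process name and PID. The sketch below assumes the message layout shown above ("Kill process PID (name)"), which may differ between kernel versions; the sample file and the variable names are hypothetical.

```shell
#!/bin/sh
# Hypothetical sample data standing in for /var/log/syslog.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
[33238.178288] Out of memory: Kill process 6230 (firefox) score 53 or sacrifice child
[33240.000000] eth0: link up
EOF

# grep -i matches "Out of memory" regardless of case; sed then keeps only
# the process name and PID from lines in the "Kill process PID (name)" layout.
killed=$(grep -i 'out of memory' "$sample_log" \
  | sed -n 's/.*Kill process \([0-9]*\) (\([^)]*\)).*/\2 (pid \1)/p')

echo "$killed"
rm -f "$sample_log"
```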

Remember that grep also uses memory, so running grep may also cause memory insufficiency errors. This is another reason why you should store logs centrally!

Scheduled task Error Log

The cron daemon is a scheduler that runs processes at specified dates and times. If a process fails to run or fails to finish, a cron error appears in your log files. Depending on your distribution, you can find these entries in /var/log/cron, /var/log/messages, or /var/log/syslog. Cron tasks can fail for many reasons. Generally, the problem lies in the process rather than in the cron daemon itself.

By default, cron sends a task's output by email, in this case through postfix. The following log shows that the email was sent. Unfortunately, you cannot see the content of the email here.
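
Because the file differs between distributions, a small helper can pick whichever candidate exists before searching it. This is only a sketch: first_readable and cron_log are hypothetical names, and the search order of the three paths is an arbitrary choice.

```shell
#!/bin/sh
# Print the first argument that names a readable regular file;
# return non-zero if none of them is readable.
first_readable() {
  for candidate in "$@"; do
    if [ -f "$candidate" ] && [ -r "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

if cron_log=$(first_readable /var/log/cron /var/log/syslog /var/log/messages); then
  # Show the 20 most recent lines that mention cron, regardless of case.
  grep -i 'cron' "$cron_log" | tail -n 20
else
  echo "No readable cron log found in the usual locations" >&2
fi
```

Reading these files usually requires root or membership in a log-reading group, so the helper may report nothing for an unprivileged user.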

    Mar 13 16:35:01 PSQ110 postfix/pickup[15158]: C3EDC5800B4: uid=1001 from=
    Mar 13 16:35:01 PSQ110 postfix/cleanup[15727]: C3EDC5800B4: message-id=<20150310110501.C3EDC5800B4@PSQ110>
    Mar 13 16:35:01 PSQ110 postfix/qmgr[15159]: C3EDC5800B4: from=
    Mar 13 16:35:05 PSQ110 postfix/smtp[15729]: C3EDC5800B4: to=in.l.google.com[74.125.130.26]:25, delay=4.1, delays=0.26/0/2.2/1.7, dsn=2.0.0, status=sent (250 2.0.0 OK 1425985505 f16si501651pdj.5 - gsmtp)

Consider logging the standard output of your cron tasks to help you locate problems. Here is an example of using the logger command to redirect a cron task's standard output to syslog. Replace the echo command with your own script. helloCron is the tag, which you can set to the name of any application you want.

    */5 * * * * echo 'Hello World!' 2>&1 | /usr/bin/logger -t helloCron

The log entries it creates:

    Apr 28 22:20:01 ip-172-31-11-231 CRON[15296]: (ubuntu) CMD (echo 'Hello World!' 2>&1 | /usr/bin/logger -t helloCron)
    Apr 28 22:20:01 ip-172-31-11-231 helloCron: Hello World!

Each cron task records different logs based on the task type and how data is output.

You can add additional log records as needed.
