Use logs for troubleshooting in Linux

Last Update:2015-09-11 Source: Internet

Author: User



Source: Loggly. Translation: LCTT.

The main reason people keep logs is troubleshooting. Usually you are diagnosing a problem that occurred in your Linux system or application. An error message or a sequence of events can give you clues to the root cause, explain how the problem occurred, and point to a fix. Here are a few examples of using logs to solve problems.

Causes of login failures

To check whether your system is secure, review the authentication log for failed logins and for successful logins by suspicious users. An authentication failure occurs when someone tries to log in with incorrect or invalid credentials, typically over SSH for remote logins or with su to switch to another local user. These events are recorded by the Pluggable Authentication Modules (PAM). Look for strings such as Failed password and user unknown in your log. A successful authentication record contains strings such as Accepted password and session opened.

Examples of failures:

    pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=10.0.2.2
    Failed password for invalid user hoover from 10.0.2.2 port 4791 ssh2
    pam_unix(sshd:auth): check pass; user unknown
    PAM service(sshd) ignoring max retries; 6 > 3

Examples of success:

    Accepted password for hoover from 10.0.2.2 port 4792 ssh2
    pam_unix(sshd:session): session opened for user hoover by (uid=0)
    pam_unix(sshd:session): session closed for user hoover

You can use grep to find which user names have the most failed logins. These are likely account names that attackers are guessing at without success. This example is from an Ubuntu system.

    $ grep "invalid user" /var/log/auth.log | cut -d ' ' -f 10 | sort | uniq -c | sort -nr
    23 oracle
    18 postgres
    17 nagios
    10 zabbix
    6 test
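The position of the user name in each line depends on the log format, so a fixed cut field may not fit every system. As a minimal sketch, assuming sshd messages of the form shown in the failure examples above, the function below extracts the name that follows "invalid user" instead of counting fields. The function name failed_logins_by_user is illustrative, not part of any tool.

```shell
# Count failed SSH logins per invalid user name, most frequent first.
# Assumes sshd lines like:
#   ... Failed password for invalid user NAME from ADDRESS port N ssh2
# Usage: failed_logins_by_user /var/log/auth.log
failed_logins_by_user() {
    # sed -n with the p flag prints only matching lines, reduced to the name.
    sed -n 's/.*Failed password for invalid user \([^ ]*\) from .*/\1/p' "$1" |
        sort | uniq -c | sort -nr |
        awk '{ print $1, $2 }'   # strip the padding that uniq -c adds
}
```

Because the pattern matches on the message text, it behaves the same whether the day of the month has one digit or two.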

Because there is no standard format, you need a different command for each application's log. A log management system can parse logs automatically and categorize them, extracting key fields such as the user name.

A log management system can use automatic parsing to extract user names from Linux logs, so you can see per-user information and filter on it with a click. In this example, filtering the logs down to the root user shows more than 2,700 login attempts for root.

A log management system also lets you chart events over time, which makes anomalies easier to spot. If someone fails to log in once or twice within a few minutes, it may be a real user who forgot the password. However, hundreds of failed logins, or failures across many different user names, more likely indicate an attack on the system. In this example, someone tried to log in as nagios hundreds of times on March 12. That is clearly not a legitimate system user.
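Without a log management system, a rough view of the same trend is possible from the command line. The sketch below groups failed password attempts by hour, assuming the traditional syslog timestamp layout (month, day, HH:MM:SS) used in the examples in this article. The function name failed_logins_per_hour is illustrative.

```shell
# Count "Failed password" lines per hour, busiest hour first.
# Assumes the first three fields are month, day, and HH:MM:SS.
# Usage: failed_logins_per_hour /var/log/auth.log
failed_logins_per_hour() {
    awk '/Failed password/ {
        split($3, t, ":")                    # t[1] is the hour
        bucket = $1 " " $2 " " t[1] ":00"
        count[bucket]++
    }
    END { for (b in count) print count[b], b }' "$1" | sort -nr
}
```

An hour with hundreds of failures stands out at the top of the output, while an occasional mistyped password stays near the bottom.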

Causes of restarts

Sometimes, a server goes down because of a system crash or reboot. How do you know when it happened and who did it?

Shutdown command

If someone runs the shutdown command manually, you can see it in the authentication log file. Here, someone connected over SSH from IP 50.0.134.125 as the ubuntu user and then shut down the system.

    Mar 19 18:36:41 ip-172-31-11-231 sshd[23437]: Accepted publickey for ubuntu from 50.0.134.125 port 52538 ssh
    Mar 19 18:36:41 ip-172-31-11-231 sshd[23437]: pam_unix(sshd:session): session opened for user ubuntu by (uid=0)
    Mar 19 18:37:09 ip-172-31-11-231 sudo: ubuntu : TTY=pts/1 ; PWD=/home/ubuntu ; USER=root ; COMMAND=/sbin/shutdown -r now
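To find entries like the last one without reading the whole file, you can search for sudo records whose COMMAND field mentions a shutdown-type command. This is a sketch that assumes the sudo log format shown above; the function name manual_restarts and the list of command names are illustrative choices.

```shell
# Print sudo log entries that ran a shutdown-type command.
# Assumes sudo lines of the form: "sudo: USER : ... ; COMMAND=/path/to/cmd args"
# Usage: manual_restarts /var/log/auth.log
manual_restarts() {
    grep -E 'sudo: .*COMMAND=.*(shutdown|reboot|halt|poweroff)' "$1"
}
```

Each matching line names the user who ran the command, and the sshd entries just before it show where that user connected from.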

Kernel initialization

To see every server restart regardless of cause (including crashes), search the kernel initialization log. Look for messages from the kernel (kernel) that report initialization (Initializing).

    Mar 19 18:39:30 ip-172-31-11-231 kernel: [ 0.000000] Initializing cgroup subsys cpuset
    Mar 19 18:39:30 ip-172-31-11-231 kernel: [ 0.000000] Initializing cgroup subsys cpu
    Mar 19 18:39:30 ip-172-31-11-231 kernel: [ 0.000000] Linux version 3.8.0-44-generic ([email protected]) (gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) ) #66~precise1-Ubuntu SMP Tue Jul 15 04:01:04 UTC 2014 (Ubuntu 3.8.0-44.66~precise1-generic 3.8.13.25)
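Because the kernel prints its "Linux version" banner once at each boot, listing those lines gives a compact restart history. The sketch below assumes the kernel messages land in a syslog-style file with the timestamp in the first three fields, as in the sample above. The function name boot_times is illustrative.

```shell
# Print the timestamp of every boot recorded in a syslog-style file.
# Assumes one "kernel: ... Linux version ..." line per boot.
# Usage: boot_times /var/log/syslog
boot_times() {
    awk '/kernel:/ && /Linux version/ { print $1, $2, $3 }' "$1"
}
```

Comparing these timestamps with the sudo entries in the authentication log helps separate planned restarts from crashes.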

Detecting memory problems

There are a number of reasons for a server crash, but a common cause is memory exhaustion.

When the system runs low on memory, the kernel kills processes, usually starting with the one that uses the most resources. The error occurs when the system has used all of its memory and a new or existing process tries to allocate more. Look for a string such as Out of memory in your log file, or kernel warning messages containing kill. These messages indicate that the system killed the process or application intentionally, rather than the process crashing on its own.

For example:

    [33238.178288] Out of memory: Kill process 6230 (firefox) score 53 or sacrifice child
    [29923450.995084] select 5230 (docker), adj 0, size 708, to kill

You can use tools like grep to find these logs. This example is in Ubuntu:

    $ grep "Out of memory" /var/log/syslog
    [33238.178288] Out of memory: Kill process 6230 (firefox) score 53 or sacrifice child

Keep in mind that grep also uses memory, so just running grep can also lead to out-of-memory errors. This is another reason why you should centrally store logs!
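If you only need to know which processes were killed, you can reduce each message to its process ID and name. This sketch matches only the "Out of memory: Kill process" wording shown in this article; other kernel versions may word the message differently, so treat the pattern as an assumption to adjust. The function name oom_victims is illustrative.

```shell
# Print "PID NAME" for each process the kernel killed for lack of memory.
# Assumes messages of the form: "Out of memory: Kill process PID (NAME) ..."
# Usage: oom_victims /var/log/syslog
oom_victims() {
    # In a basic regular expression, plain ( ) are literal and \( \) capture.
    sed -n 's/.*Out of memory: Kill process \([0-9][0-9]*\) (\([^)]*\)).*/\1 \2/p' "$1"
}
```

Piping the result through sort and uniq -c shows whether the same application is killed repeatedly.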

Cron job error logs

The cron daemon is a scheduler that runs processes at specified dates and times. If a process fails to run or does not complete, a cron error appears in your log file. Depending on your distribution, you can find this log in /var/log/cron, /var/log/messages, or /var/log/syslog. Cron jobs fail for many reasons. Typically, the problem lies in the process rather than in the cron daemon itself.
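If you are not sure which of these files your system uses, a quick check is to see which ones exist and how many cron-related lines each contains. The sketch below takes the candidate paths as arguments; the function name cron_log_candidates is illustrative.

```shell
# For each readable file given, print its path and the number of lines
# that mention cron (case-insensitive). Missing files are skipped.
# Usage: cron_log_candidates /var/log/cron /var/log/messages /var/log/syslog
cron_log_candidates() {
    for f in "$@"; do
        if [ -r "$f" ]; then
            # grep -c prints the count of matching lines; -i ignores case.
            printf '%s %s\n' "$f" "$(grep -ic 'cron' "$f")"
        fi
    done
}
```

The file with a non-zero count is the one to search when a scheduled job misbehaves. Reading these files may require root privileges.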

By default, the output of a cron job is sent as an e-mail message through Postfix. The log below shows that the message was sent. Unfortunately, you can't see the contents of the message here.

    Mar 13 16:35:01 PSQ110 postfix/pickup[15158]: C3EDC5800B4: uid=1001 from=
    Mar 13 16:35:01 PSQ110 postfix/cleanup[15727]: C3EDC5800B4: message-id=<[email protected]>
    Mar 13 16:35:01 PSQ110 postfix/qmgr[15159]: C3EDC5800B4: from=<[email protected]>, size=607, nrcpt=1 (queue active)
    Mar 13 16:35:05 PSQ110 postfix/smtp[15729]: C3EDC5800B4: to=<[email protected]>, relay=gmail-smtp-in.l.google.com[74.125.130.26]:25, delay=4.1, delays=0.26/0/2.2/1.7, dsn=2.0.0, status=sent (250 2.0.0 OK 1425985505 f16si501651pdj.5 - gsmtp)

Consider sending the standard output of your cron jobs to a log to help you locate problems. Here is an example that uses the logger command to redirect cron standard output to syslog. Replace the echo command with your own script; helloCron can be any application name you want.

    */5 * * * * echo 'Hello World!' 2>&1 | /usr/bin/logger -t helloCron

It creates a log entry:

    Apr 28 22:20:01 ip-172-31-11-231 CRON[15296]: (ubuntu) CMD (echo 'Hello World!' 2>&1 | /usr/bin/logger -t helloCron)
    Apr 28 22:20:01 ip-172-31-11-231 helloCron: Hello World!

Each cron task records different logs based on the specific type of task and how the data is output.
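Piping output to logger records what a job printed but not whether it succeeded. One possible extension, sketched here under assumptions, is a small wrapper that logs the output and then adds an error-priority entry when the job exits with a non-zero status. The function name run_and_log and the LOGGER override variable are hypothetical; LOGGER exists only so that a stand-in command can replace logger when trying the wrapper out.

```shell
# Run a command, send its combined output to syslog under TAG, and add an
# error entry if it exits non-zero. Returns the command's exit status.
# Usage: run_and_log helloCron /path/to/your/script
# LOGGER is a hypothetical override; by default the real logger is used.
run_and_log() {
    tag="$1"
    shift
    if output=$("$@" 2>&1); then
        status=0
    else
        status=$?
    fi
    if [ -n "$output" ]; then
        printf '%s\n' "$output" | "${LOGGER:-logger}" -t "$tag"
    fi
    if [ "$status" -ne 0 ]; then
        # -p sets the facility and level of the syslog entry.
        "${LOGGER:-logger}" -t "$tag" -p user.err "exit status $status"
    fi
    return "$status"
}
```

Saved in a script that the crontab entry calls, a wrapper like this leaves one syslog entry per line of output plus a clear marker for failed runs.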

With luck, the existing logs will already point to the source of the problem; if they do not, you can add more logging as needed.



This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.
