Using Linux Logs to Troubleshoot
Troubleshooting is the main reason logs are created. Typically you want to diagnose why a problem occurred in your Linux system or application. An error message or a sequence of events can give you clues to the root cause, show how the problem happened, and point to a fix. Here are a few examples of using logs to solve problems.

Cause of login failures

If you want to check whether your system is secure, look in the authentication log for failed login attempts and for successful logins by unfamiliar users. Authentication fails when someone logs in with improper or invalid credentials, which often happens when using SSH for remote access or su to run as another local user. These events are recorded by the Pluggable Authentication Modules (PAM). Failed attempts show up as strings like "Failed password" and "user unknown". Successful authentications include strings like "Accepted password" and "session opened".

Failure examples:

    pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=10.0.2.2
    Failed password for invalid user hoover from 10.0.2.2 port 4791 ssh2
    pam_unix(sshd:auth): check pass; user unknown
    PAM service(sshd) ignoring max retries; 6 > 3

Success examples:

    Accepted password for hoover from 10.0.2.2 port 4792 ssh2
    pam_unix(sshd:session): session opened for user hoover by (uid=0)
    pam_unix(sshd:session): session closed for user hoover

You can use grep to find which user names have the most failed logins. These are potential accounts that attackers are trying and failing to access. This example is for Ubuntu:

    $ grep "invalid user" /var/log/auth.log | cut -d ' ' -f 10 | sort | uniq -c | sort -nr
    23 oracle
    18 ipvs
    17 nagios
    10 zabbix
    6 test

Because there is no standard log format, you need a different command for each application's log. A log management system can parse logs automatically and classify them, extracting key fields such as user names.
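The grep pipeline above picks the user name by field position, which shifts when the timestamp pads a single-digit day with an extra space. As a minimal sketch of a position-independent variant, the function below extracts the name that follows "invalid user" instead. The function name count_invalid_users is hypothetical, the default path assumes Ubuntu's /var/log/auth.log, and the pattern assumes OpenSSH-style "Failed password for invalid user NAME from ADDR" lines like the failure example shown earlier.

```shell
#!/bin/sh
# Count failed SSH logins per attempted (nonexistent) user name.
# Assumption: lines look like
#   "... Failed password for invalid user NAME from ADDR port N ssh2"
count_invalid_users() {
    # $1: path to the authentication log (hypothetical default for Ubuntu)
    log_file="${1:-/var/log/auth.log}"

    # Print only the NAME captured between "invalid user " and " from ",
    # then count duplicates and sort by count, highest first.
    sed -n 's/.*Failed password for invalid user \([^ ]*\) from .*/\1/p' "$log_file" |
        sort | uniq -c | sort -nr
}
```

Calling count_invalid_users with no argument reads the default path, which usually requires root or membership in a log-reading group. Passing a path lets you run it against a copy of the log.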
A log management system can use automatic parsing to extract user names from Linux logs, which lets you view user information and filter on it with a single click. In this example, filtering the logs to the root user shows more than 2,700 login attempts for that account.

A log management system also lets you view logs on a timeline chart, which makes anomalies easier to spot. If someone fails to log in once or twice within a few minutes, it may be a real user who forgot the password. But hundreds of failed logins, or failures across many different user names, more likely indicate an attempt to attack the system. Here, someone tried to log in as nagios hundreds of times on March 12. This is clearly not a legitimate use of the system.

Sometimes a server goes down because of a system crash or a reboot. How do you know when it happened, and who did it?

Shutdown command

If someone ran the shutdown command manually, you can see it in the authentication log file. Here, someone logged in remotely as the user ubuntu from the IP address 50.0.134.125 and then shut down the system.

    Mar 19 18:36:41 ip-172-31-11-231 sshd[23437]: Accepted publickey for ubuntu from 50.0.134.125 port 52538 ssh
    Mar 19 18:36:41 ip-172-31-11-231 sshd[23437]: pam_unix(sshd:session): session opened for user ubuntu by (uid=0)
    Mar 19 18:37:09 ip-172-31-11-231 sudo: ubuntu : TTY=pts/1 ; PWD=/home/ubuntu ; USER=root ; COMMAND=/sbin/shutdown -r now

Kernel initialization

If you want to see every server restart, whatever its cause (including crashes), search the kernel initialization logs. Look for messages from the kernel facility and for CPU initialization messages.
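The two searches described above can be sketched as a pair of small shell functions: one looks for sudo invocations of shutdown in the authentication log, the other for the kernel's "Linux version" banner, which is printed once per boot. The function names are hypothetical, and the default paths are assumptions for an Ubuntu system. Other distributions may log to different files.

```shell
#!/bin/sh
# List sudo invocations of the shutdown command.
# Assumption: sudo logs lines containing "COMMAND=/path/to/shutdown ...".
find_shutdown_commands() {
    # $1: authentication log (hypothetical default for Ubuntu)
    grep -E 'sudo.*COMMAND=.*shutdown' "${1:-/var/log/auth.log}"
}

# List kernel boot banners; each match marks one system start,
# whether it followed a clean reboot or a crash.
find_boot_events() {
    # $1: system log that receives kernel messages (hypothetical default)
    grep -E 'kernel:.*Linux version' "${1:-/var/log/syslog}"
}
```

A boot banner with no matching shutdown command shortly before it suggests the restart was not requested through sudo, which is worth investigating as a possible crash.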
A boot shows up in the kernel log as initialization messages like these:

    Mar 19 18:39:30 ip-172-31-11-231 kernel: [0.000000] Initializing cgroup subsys cpuset
    Mar 19 18:39:30 ip-172-31-11-231 kernel: [0.000000] Initializing cgroup subsys cpu
    Mar 19 18:39:30 ip-172-31-11-231 kernel: [0.000000] Linux version 3.8.0-44-generic (buildd@tipua) (gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) ) #66~precise1-Ubuntu SMP Tue Jul 15 04:01:04 UTC 2014 (Ubuntu 3.8.0-44.66~precise1-generic 3.8.13.25)

Detecting memory problems

A server can crash for many reasons, but one common cause is running out of memory. When the system is low on memory, processes are killed. Generally, the process using the most resources is killed first. The error occurs when the system has used all of its memory and a new or existing process tries to use more. Look in your log files for strings like "Out of memory" or for kernel warnings that mention kill. These messages indicate that the system killed the process or application intentionally, rather than letting it crash.

For example:

    [33238.178288] Out of memory: Kill process 6230 (firefox) score 53 or sacrifice child
    [29923450.995084] select 5230 (docker), adj 0, size 708, to kill

You can find these logs using a tool like grep. This example is for Ubuntu:

    $ grep "Out of memory" /var/log/syslog
    [33238.178288] Out of memory: Kill process 6230 (firefox) score 53 or sacrifice child

Keep in mind that grep itself uses memory, so running it may trigger an out-of-memory error. This is another reason to store your logs centrally!

The cron daemon is a scheduler that runs processes at specified dates and times. If a process fails to run or fails to finish, a cron error appears in your log files. Depending on your distribution, you can find cron logs in /var/log/cron, /var/log/messages, or /var/log/syslog. Cron jobs can fail for many reasons. Usually the problem lies with the process rather than with the cron daemon itself. By default, cron sends the output of a job by email, using postfix.
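Because the cron log location depends on the distribution, one approach is to search every candidate file and skip the ones that do not exist. The sketch below does that. The function name find_cron_entries is hypothetical, and the case-insensitive match on "cron" is an assumption that will also catch unrelated lines mentioning the word.

```shell
#!/bin/sh
# Print cron-related lines from each readable log file given as an argument.
find_cron_entries() {
    for log_file in "$@"; do
        # Skip locations that this distribution does not use.
        [ -r "$log_file" ] || continue
        # -i matches both "cron" and "CRON"; -H prefixes each match
        # with the file name so you can tell where it came from.
        grep -iH 'cron' "$log_file" || true
    done
    return 0
}
```

With the paths named above, a call would look like: find_cron_entries /var/log/cron /var/log/messages /var/log/syslog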
The following log entry shows that the email was sent. Unfortunately, the log does not show the content of the email.

    Mar 13 16:35:01 PSQ110 postfix/pickup[15158]: C3EDC5800B4: uid=1001 from=