This post walks through the log analysis after a server crash. Damn, the report I had written suddenly disappeared. Was it because of the video I watched yesterday? 360 or whatever it was is really unreliable. Never mind.

According to /var/log/messages, the system was still providing service normally up to October 11. If a service had brought the system down, other log entries would have been generated. At 19:44:20 on October 11 the system starts writing a long run of startup messages. The preliminary judgment is that the cause is hardware or something else in the system itself. After the restart the business ran normally again, which suggests that the other services and their configuration are not at fault. The login log rules out intrusion and human action as the cause of the downtime, which leaves hardware or the system itself. However, the server's indicator lights report no error, and the hard disks, memory, power supply, and board are all working properly.

Oct  6 04:03:02 epmttetla syslogd 1.4.1: restart.
Oct 11 19:44:20 epmttetla syslogd 1.4.1: restart.
Oct 11 19:44:20 epmttetla kernel: klogd 1.4.1, log source = /proc/kmsg started.
Oct 11 19:44:20 epmttetla kernel: Linux version 2.6.18-164.el5 (mockbuild@x86-003.build.bos.redhat.com) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-46)) #1 SMP Tue Aug 18 15:51:48 EDT 2009
Oct 11 19:44:20 epmttetla kernel: Command line: ro root=LABEL=/ rhgb quiet
Oct 11 19:44:20 epmttetla kernel: BIOS-provided physical RAM map:
Oct 11 19:44:20 epmttetla kernel: PNP: No PS/2 controller found. Probing ports directly.

The initial assessment is that the kernel does not fully support the dual-core CPU, or that there is some other problem with the kernel, but that alone should not bring the system down. These messages are not fatal errors, and the failure is not one the kernel itself can detect. In other words, the kernel went down inexplicably while it was initializing itself. The cause may lie in the hardware.
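The restart entries in /var/log/messages can be told apart mechanically. The following is a minimal sketch, not the method used for this analysis: it treats a "syslogd ... restart" entry as a boot only when a klogd start-up message with the same timestamp follows it, and treats any other restart as a plain daemon restart (on this kind of system that is often log rotation, but that is an assumption). The names parse_line, classify_restarts, and ASSUMED_YEAR are hypothetical. Classic syslog lines carry no year, so the year has to be supplied by hand.

```python
import re
from datetime import datetime

# Classic syslog layout: "Oct 11 19:44:20 host rest-of-line" (no year field).
LINE_RE = re.compile(r"^(\w{3})\s+(\d{1,2})\s+(\d{2}:\d{2}:\d{2})\s+(\S+)\s+(.*)$")

# Assumption: the log does not record the year, so this value is a placeholder.
ASSUMED_YEAR = 2009


def parse_line(line, year):
    """Return (timestamp, rest_of_line), or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    month, day, clock, _host, rest = match.groups()
    stamp = datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")
    return stamp, rest


def classify_restarts(lines, year):
    """Split syslogd restart entries into (boots, plain_restarts).

    A restart counts as a boot when the next parsed entry is a klogd
    start-up message carrying the same timestamp.
    """
    parsed = [p for p in (parse_line(line, year) for line in lines) if p is not None]
    boots, plain = [], []
    for index, (stamp, rest) in enumerate(parsed):
        if not (rest.startswith("syslogd ") and rest.endswith("restart.")):
            continue
        following = parsed[index + 1] if index + 1 < len(parsed) else None
        is_boot = (
            following is not None
            and following[0] == stamp
            and "klogd" in following[1]
            and "started" in following[1]
        )
        (boots if is_boot else plain).append(stamp)
    return boots, plain


# Lines taken from the excerpt in this post, with spacing repaired.
SAMPLE = [
    "Oct  6 04:03:02 epmttetla syslogd 1.4.1: restart.",
    "Oct 11 19:44:20 epmttetla syslogd 1.4.1: restart.",
    "Oct 11 19:44:20 epmttetla kernel: klogd 1.4.1, log source = /proc/kmsg started.",
]

if __name__ == "__main__":
    boots, plain = classify_restarts(SAMPLE, ASSUMED_YEAR)
    print("boots:", boots)
    print("plain restarts:", plain)
```

Run against the full file, the entry just before each boot timestamp shows how long the system was silent before it came back up, which is the gap this analysis cares about.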
A feasible test method is as follows:
1. Check the motherboard battery and press it firmly into place, then confirm that the system starts normally.
2. If step 1 does not help, replace the hardware components one by one, depending on the situation.
3. If neither 1 nor 2 works, determine whether the server went down because of errors reported by other services.

The above is based on analysis by others on the Internet. I personally think the system kernel may be one cause of the error. I have checked all the logs, but the system has not been in production for long and my permissions are insufficient, so I cannot get more material to analyze whether a stopped system service caused the downtime.

Oct 11 19:44:20 epmttetla kernel: usbcore: registered new driver hiddev
Oct 11 19:44:20 epmttetla kernel: usbcore: registered new driver usbhid
Oct 11 19:44:20 epmttetla kernel: drivers/usb/input/hid-core.c: v2.6: USB HID core driver
Oct 11 19:44:20 epmttetla kernel: PNP: No PS/2 controller found. Probing ports directly.
Oct 11 19:44:20 epmttetla kernel: Failed to disable AUX port, but continuing anyway... Is this a SiS?
Oct 11 19:44:20 epmttetla kernel: If AUX port is really absent please use the 'i8042.noaux' option.
Oct 11 19:44:20 epmttetla kernel: serio: i8042 KBD port at 0x60,0x64 irq 1
Oct 11 19:44:20 epmttetla kernel: mice: PS/2 mouse device common for all mice
Oct 11 19:44:20 epmttetla kernel: md: md driver 0.90.3 MAX_MD_DEVS=256, MD_SB_DISKS=27
Oct 11 19:44:20 epmttetla kernel: md: bitmap version 4.39
Oct 11 19:44:20 epmttetla kernel: TCP bic registered

The hard disk SMART check reports an error. It may be that when business traffic was too heavy, the disk read/write load became too high and brought the system down. It is also possible that the hard disk has been in use beyond its service life. After the server was started, the hard disk indicator did not raise an alarm. However, a SMART error does not necessarily trigger an alarm on the server indicator.
The recommendation is to back up the hard disks and then replace them one at a time, which keeps the service running and prevents data loss. In fact, this is the only possibility left. There are only two hard disks in the server, presumably configured as RAID 1, with Tomcat and MySQL running on them. Could the hard disks hang when business traffic is too heavy? The server is running normally now, and there should be a spare disk in the warehouse. Fingers crossed that no data is lost.

Oct 11 19:47:27 epmttetla smartd[6384]: smartd version 5.38 [x86_64-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Oct 11 19:47:27 epmttetla smartd[6384]: Home page is http://smartmontools.sourceforge.net/
Oct 11 19:47:27 epmttetla smartd[6384]: Opened configuration file /etc/smartd.conf
Oct 11 19:47:27 epmttetla smartd[6384]: Configuration file /etc/smartd.conf was parsed, found DEVICESCAN, scanning devices
Oct 11 19:47:27 epmttetla smartd[6384]: Problem creating device name scan list
Oct 11 19:47:27 epmttetla smartd[6384]: Device: /dev/sda, opened
Oct 11 19:47:27 epmttetla smartd[6384]: Device: /dev/sda, IE (SMART) not enabled, skip device Try 'smartctl -s on /dev/sda' to turn on SMART features
Oct 11 19:47:27 epmttetla smartd[6384]: Monitoring 0 ATA and 0 SCSI devices
Oct 11 19:47:27 epmttetla smartd[6386]: smartd has fork()ed into background mode. New PID=6386.
Oct 11 19:47:28 epmttetla avahi-daemon[6324]: Server startup complete. Host name is epmttetla.local. Local service cookie is 255.669660.
Oct 11 19:47:29 epmttetla avahi-daemon[6324]: Service "SFTP File Transfer on epmttetla" (/services/sftp-ssh.service) successfully established.
Oct 11 19:47:30 epmttetla kernel: mtrr: type mismatch for f9000000,400000 old: write-back new: write-combining
Oct 11 19:47:30 epmttetla kernel: mtrr: type mismatch for f9000000,1000000 old: write-back new: write-combining
Oct 11 19:47:31 epmttetla pcscd: winscard.c:304:SCardConnect() Reader E-Gate 0 0 Not Found
Oct 11 19:47:31 epmttetla last message repeated 3 times
Oct 12 21:40:01 epmttetla auditd[5661]: Audit daemon rotating log files
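
The smartd entries above are worth reading closely before drawing conclusions about the disks. In this excerpt smartd appears to skip /dev/sda because SMART is not enabled, and it ends up monitoring 0 devices, so the log by itself neither confirms nor rules out a disk fault. The following is a minimal sketch that pulls those two facts out of the log. The function name summarize_smartd and the returned keys are hypothetical, and the patterns are written only against the message wording seen in this excerpt.

```python
import re

# Patterns based only on the smartd 5.38 messages quoted in this post.
NOT_ENABLED_RE = re.compile(r"Device: ([^,\s]+), .*not enabled")
MONITORING_RE = re.compile(r"Monitoring (\d+) ATA and (\d+) SCSI devices")


def summarize_smartd(lines):
    """Collect devices smartd skipped and the number it ended up monitoring."""
    skipped = []
    monitored = None  # stays None if no "Monitoring ..." line is present
    for line in lines:
        if "smartd" not in line:
            continue
        match = NOT_ENABLED_RE.search(line)
        if match:
            skipped.append(match.group(1))
            continue
        match = MONITORING_RE.search(line)
        if match:
            monitored = int(match.group(1)) + int(match.group(2))
    return {"skipped": skipped, "monitored": monitored}


# Lines taken from the excerpt in this post, with spacing repaired.
SMARTD_SAMPLE = [
    "Oct 11 19:47:27 epmttetla smartd[6384]: Device: /dev/sda, opened",
    "Oct 11 19:47:27 epmttetla smartd[6384]: Device: /dev/sda, IE (SMART) not enabled, "
    "skip device Try 'smartctl -s on /dev/sda' to turn on SMART features",
    "Oct 11 19:47:27 epmttetla smartd[6384]: Monitoring 0 ATA and 0 SCSI devices",
]

if __name__ == "__main__":
    print(summarize_smartd(SMARTD_SAMPLE))
```

If the summary shows skipped devices and nothing monitored, the command the log itself suggests ('smartctl -s on /dev/sda') is the natural next step before deciding which disk to replace.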