Linux server hardware errors and abnormal restarts: detection with mcelog

Source: Internet
Author: User
Tags system log server memory

mcelog is a tool used on x86 Linux systems to check for hardware errors, especially memory and CPU errors.
It is useful, for example, when a server restarts inexplicably from time to time and neither /var/log/messages nor syslog contains any useful information.

An MCE (machine check exception) is usually caused by one of the following:
1. Memory errors or ECC problems
2. Processor overheating
3. System bus errors
4. CPU or hardware cache errors


In general, when an error is reported, look at the memory first. However, because the memory controller is now integrated into the CPU, a few cases turn out to be caused by CPU problems.


First, if the server has network access and a yum repository is configured, install online:
yum install mcelog

Then run:

service mcelogd start

mcelog --daemon
To view the log, open:
/var/log/mcelog
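
As a quick check after starting the daemon, the sketch below reports whether the log file exists and whether anything has been written to it. The function name `mce_log_status` and its three output strings are illustrative choices, not part of mcelog.

```shell
# mce_log_status [FILE]: report the state of the mcelog log file.
# Defaults to /var/log/mcelog, the path given in this article.
mce_log_status() {
    log=${1:-/var/log/mcelog}
    if [ ! -e "$log" ]; then
        echo "missing"        # file not created (daemon not running or never logged)
    elif [ -s "$log" ]; then
        echo "has-records"    # file exists and is non-empty
    else
        echo "empty"          # file exists but nothing has been logged
    fi
}

mce_log_status /var/log/mcelog
```

An empty or missing file only means that mcelog has not written anything there; it does not by itself rule out a hardware problem.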

The log for a restart caused by a hardware fault looks like this:
MCE 0
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 1 BANK 8 TSC 1193fd60c6699 [at 1 Mhz 18:56:49 uptime (unreliable)]
MISC 8f44960800095840 ADDR 4a9f3b1c0
MCG status:
MCi status:
Error overflow
MCi_MISC register valid
MCi_ADDR register valid
MCA: MEMORY CONTROLLER RD_CHANNELunspecified_ERR
Transaction: Memory read error
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 18
Memory transaction Tracker ID (RTId): 40
Memory DIMM ID of error: 1
Memory channel ID of error: 0
Memory ECC syndrome: f449608
STATUS cc0004800001009f MCGSTATUS 0
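
The decoded lines in this record can be cross-checked against the raw STATUS value. The sketch below assumes the IA32_MCi_STATUS bit layout from the Intel architecture manual (bit 63 valid, bit 62 overflow, bit 61 uncorrected, bit 59 MISC valid, bit 58 ADDR valid, bits 52:38 corrected error count on CPUs that report it, bits 15:0 MCA error code). The function name `decode_mci_status` is hypothetical, and only a few fields are decoded.

```shell
# decode_mci_status HEX16: print selected IA32_MCi_STATUS fields.
# Expects exactly 16 hex digits without a 0x prefix.
decode_mci_status() {
    s=$1
    [ "${#s}" -eq 16 ] || { echo "expected 16 hex digits" >&2; return 1; }
    # Split into 32-bit halves so the arithmetic stays within signed 64-bit range.
    hi=$((0x${s%????????}))   # register bits 63..32
    lo=$((0x${s#????????}))   # register bits 31..0
    echo "valid=$(( (hi >> 31) & 1 ))"               # bit 63
    echo "overflow=$(( (hi >> 30) & 1 ))"            # bit 62
    echo "uncorrected=$(( (hi >> 29) & 1 ))"         # bit 61
    echo "misc_valid=$(( (hi >> 27) & 1 ))"          # bit 59
    echo "addr_valid=$(( (hi >> 26) & 1 ))"          # bit 58
    echo "corrected_count=$(( (hi >> 6) & 0x7fff ))" # bits 52..38
    printf 'mca_code=0x%04x\n' "$(( lo & 0xffff ))"  # bits 15..0
}

decode_mci_status cc0004800001009f
```

For the STATUS value in the log above, this prints overflow=1, uncorrected=0, misc_valid=1, addr_valid=1 and corrected_count=18, which agrees with the "Error overflow", "register valid" and "corrected error count" lines that mcelog printed.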

Second, local installation:
rpm -ivh mcelog-109-4.0fc9f70.el6.x86_64.rpm   # the RPM is in the attachment
service mcelogd start
mcelog


mcelog-related files
/dev/mcelog                device file
/var/log/mcelog            log file
/etc/mcelog/mcelog.conf    configuration file
/var/run/mcelog.pid        PID file

By default, faults are recorded only in /var/log/mcelog and not in the system log.
To have them appear in the system log as well, edit /etc/mcelog/mcelog.conf, remove the leading # from the corresponding option, and save the file.
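
The original does not name the line to uncomment. The sketch below assumes it is a `#syslog = yes` entry; that key name and the sample file contents are assumptions, so check your own mcelog.conf first. The helper `uncomment_option` is hypothetical and is demonstrated on a temporary copy rather than on the real configuration file.

```shell
# uncomment_option KEY FILE: remove the leading "#" from a "#KEY = value" line.
uncomment_option() {
    key=$1
    file=$2
    tmp=$(mktemp) || return 1
    # Write through a temp file and copy back, so the original file's permissions are kept.
    sed "s/^#[[:space:]]*\(${key}[[:space:]]*=\)/\1/" "$file" > "$tmp" && cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Demonstration on a scratch file with assumed contents.
conf=$(mktemp)
printf '%s\n' '# logging options' '#syslog = yes' '#logfile = /var/log/mcelog' > "$conf"
uncomment_option syslog "$conf"
cat "$conf"
rm -f "$conf"
```

Only the line whose key matches exactly is changed; ordinary comment lines and other commented-out options are left alone.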

mcelog-related settings
1. mcelog starts with the system. The kernel config file under /boot shows that the MCE module is enabled at boot.
2. Configure mcelog to run in the background:
# mcelog --daemon
3. View Mcelo
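
The kernel config check in item 1 can be sketched as follows. It assumes the option in question is `CONFIG_X86_MCE` and that the config file is named `/boot/config-<kernel release>`, as on most RPM-based distributions; the original names neither. The function name `mce_enabled_in_config` is illustrative.

```shell
# mce_enabled_in_config FILE: succeed if the kernel config has machine-check support built in.
mce_enabled_in_config() {
    grep -q '^CONFIG_X86_MCE=y' "$1"
}

cfg="/boot/config-$(uname -r)"
if [ -r "$cfg" ] && mce_enabled_in_config "$cfg"; then
    echo "CONFIG_X86_MCE=y found in $cfg"
else
    echo "CONFIG_X86_MCE=y not found in $cfg (or the file is not readable)"
fi
```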

Because each vendor's memory and CPU slot layout may differ, the location that mcelog reports (DIMM and channel IDs) may not be accurate.
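
Even when the IDs do not map directly to the physical slot labels, it can help to see which DIMM and channel pairs recur in the log. The sketch below tallies them from decoded mcelog output. It assumes the "Memory DIMM ID of error" and "Memory channel ID of error" line format shown in the log above; the function name `summarize_dimm_ids` is hypothetical.

```shell
# summarize_dimm_ids [FILE]: count DIMM/channel pairs in decoded mcelog output.
# Reads standard input when no file is given.
summarize_dimm_ids() {
    awk -F': *' '
        tolower($0) ~ /memory dimm id of error/    { dimm = $2 }
        tolower($0) ~ /memory channel id of error/ { count["dimm=" dimm " channel=" $2]++ }
        END { for (k in count) print count[k], k }
    ' "$@" | sort -rn
}

# Demonstration on two sample records.
summarize_dimm_ids <<'EOF'
Memory DIMM ID of error: 1
Memory channel ID of error: 0
Memory DIMM ID of error: 1
Memory channel ID of error: 0
EOF
```

On a real system the call would be `summarize_dimm_ids /var/log/mcelog`; the pair with the highest count is the first candidate to check against the vendor's slot documentation.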



This article is from the "Heavenly Soul Eternal" blog, please be sure to keep this source http://tianhunyongheng.blog.51cto.com/1446947/1692949
